[Model Inference] Tutorial: Deploying YOLOv3 and YOLOv4 with DeepStream 6.0

Hello everyone, I'm 极智视界. This article describes how to deploy YOLOv3 and YOLOv4 with DeepStream 6.0.

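The YOLO family is one of the most widely used sets of object detection algorithms in engineering practice. Starting with YOLOv3 and evolving through YOLOv4, YOLOv5 and later versions, it has gained steadily wider acceptance in production. DeepStream is NVIDIA's pipeline application framework for accelerating the deployment of deep learning. So what happens when DeepStream meets YOLO? Let's take a look.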

For DeepStream installation, see my earlier posts: "[Experience Sharing] Installing DeepStream 6.0 on Ubuntu" and "[Experience Sharing] Installing DeepStream 5.1 on Ubuntu".

First, here is the directory structure of the DeepStream 6.0 sources:

apps
- apps-common
- audio_apps
- sample_apps: sample applications such as deepstream-app and deepstream-test1
gst-plugins: GStreamer plugins
include: header files
libs: libraries
objectDetector_FasterRCNN: Faster R-CNN sample
objectDetector_SSD: SSD sample
objectDetector_Yolo: YOLO sample
tools: logging-related tools

1. Deploying YOLOv3 with DeepStream 6.0

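We run YOLOv3 through the objectDetector_Yolo project listed above. Within that project, the parts to focus on are:

nvdsinfer_custom_impl_Yolo: implementation code for the YOLOv3 project
- nvdsinfer_yolo_engine.cpp: parses the model and builds the engine
- nvdsparsebbox_Yolo.cpp: parsing function for the output layers, which extracts the detection bounding boxes
- trt_utils.cpp and trt_utils.h: interface and implementation of the utility classes used to construct the TensorRT network
- yolo.cpp and yolo.h: interface and implementation for building the YOLO engine
- yoloPlugins.cpp and yoloPlugins.h: interface and implementation of YoloLayerV3 and YoloLayerV3PluginCreator
- kernels.cu: low-level CUDA kernel implementation
config_infer_xxx.txt: model configuration, read by the GStreamer nvinfer plugin
deepstream_app_config_xxx.txt: deepstream-app pipeline configuration, which points to the nvinfer configuration file
xxx.cfg, xxx.weights: model files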

That is all we need, so let's get started.

1.1 Download the model files

The DeepStream 6.0 SDK does not ship the YOLOv3 model files, so you need to download them yourself. Links:

yolov3.cfg: https://github.com/pjreddie/d…

yolov3.weights: https://link.zhihu.com/?targe…

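One more note: if you already have a TensorRT yolov3.engine, the original model files are not needed. If you do not have an .engine file, one is generated first from the original model files.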

1.2 Configure config_infer_primary_yoloV3.txt

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
#0=RGB, 1=BGR
model-color-format=0
custom-network-config=yolov3.cfg
model-file=yolov3.weights
labelfile-path=labels.txt
int8-calib-file=yolov3-calibration.table.trt7.0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=80
gie-unique-id=1
network-type=0
is-classifier=0
cluster-mode=2
maintain-aspect-ratio=1
parse-bbox-func-name=NvDsInferParseCustomYoloV3
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet

[class-attrs-all]
nms-iou-threshold=0.3
threshold=0.7
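
To make two of the settings above more concrete, here is a minimal Python sketch that reads a trimmed copy of the [property] section with the standard configparser module. It decodes network-mode using the mapping from the config comment, and shows that net-scale-factor is approximately 1/255, so 8-bit pixel values end up in roughly the 0..1 range. The preprocessing formula y = net-scale-factor * (x - mean) is how the nvinfer documentation describes it; the names NETWORK_MODES and describe are made up for this illustration, and no mean offset is assumed.

```python
import configparser

# Trimmed copy of the [property] section shown above.
CONFIG_TEXT = """
[property]
net-scale-factor=0.0039215697906911373
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=80
"""

# Mapping taken from the comment inside the config file itself.
NETWORK_MODES = {0: "FP32", 1: "INT8", 2: "FP16"}


def describe(config_text):
    """Return (scale factor, precision name) from an nvinfer-style config."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    prop = parser["property"]
    scale = prop.getfloat("net-scale-factor")
    mode = NETWORK_MODES[prop.getint("network-mode")]
    return scale, mode


scale, mode = describe(CONFIG_TEXT)
# With no mean offset, a pixel value x is scaled as y = scale * x,
# so 0 stays 0 and 255 lands very close to 1.0.
print(mode, scale * 0, scale * 255)
```

Since network-mode=1 selects INT8, the int8-calib-file entry in the same section is the setting that matters for this precision; switching to FP16 would mean changing network-mode to 2.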

1.3 Configure deepstream_app_config_yoloV3.txt

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5

[tiled-display]
enable=1
rows=1
columns=1
width=1280
height=720
gpu-id=0
nvbuf-memory-type=0

[source0]
enable=1
type=3
uri=file://../../samples/streams/sample_1080p_h264.mp4
num-sources=1
gpu-id=0
cudadec-memtype=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=0
source-id=0
gpu-id=0
nvbuf-memory-type=0

[osd]
enable=1
gpu-id=0
border-width=1
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=0
live-source=0
batch-size=1
batched-push-timeout=40000
width=1920
height=1080
enable-padding=0
nvbuf-memory-type=0

[primary-gie]
enable=1
gpu-id=0
#model-engine-file=model_b1_gpu0_int8.engine
labelfile-path=labels.txt
batch-size=1
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=2
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_yoloV3.txt

[tracker]
enable=1
tracker-width=640
tracker-height=384
ll-lib-file=/opt/nvidia/deepstream/deepstream-6.0/lib/libnvds_nvmultiobjecttracker.so
ll-config-file=../../samples/configs/deepstream-app/config_tracker_NvDCF_perf.yml
gpu-id=0
enable-batch-process=1
enable-past-frame=1
display-tracking-id=1

[tests]
file-loop=0
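
The interval=2 entry under [primary-gie] is easy to misread. The DeepStream documentation describes interval as the number of consecutive batches to skip for inference, so with batch-size=1 the detector would run on every third frame. The sketch below only illustrates that arithmetic; the function name inferred_frames is hypothetical, and it assumes that counting starts with an inferred frame.

```python
def inferred_frames(total_frames, interval):
    """Return the indices of frames that would go through inference.

    Assumes batch-size=1 (one frame per batch) and that `interval`
    batches are skipped after each batch that is inferred.
    """
    step = interval + 1
    return [i for i in range(total_frames) if i % step == 0]


print(inferred_frames(10, 2))  # [0, 3, 6, 9]
print(inferred_frames(5, 0))   # [0, 1, 2, 3, 4]
```

With the [tracker] group enabled as above, the tracker is what keeps objects labelled on the frames in between; setting interval=0 runs the detector on every frame at a higher compute cost.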

1.4 Build the project

Change to /opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo:

cd /opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo

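Run the following two commands in order to build the .so file:

export CUDA_VER=11.4    # set this to the CUDA version installed on your device

Alternatively, set CUDA_VER directly in /opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/Makefile.

Then run the build: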

make -C nvdsinfer_custom_impl_Yolo

The build produces the shared library libnvdsinfer_custom_impl_Yolo.so.
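
If you are unsure which value to use for CUDA_VER, one option is to read it from the output of nvcc --version. The following is a small sketch under that assumption: the helper cuda_version_from_nvcc is hypothetical, and SAMPLE is an abbreviated, illustrative copy of typical nvcc output rather than output captured for this article.

```python
import re


def cuda_version_from_nvcc(nvcc_output):
    """Extract 'major.minor' from the text printed by `nvcc --version`."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    if match is None:
        raise ValueError("no CUDA release string found")
    return match.group(1)


# Abbreviated example of what nvcc prints (illustrative only).
SAMPLE = """nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 11.4, V11.4.48
"""

version = cuda_version_from_nvcc(SAMPLE)
print(f"export CUDA_VER={version}")
```

On a real machine, the text could come from subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout instead of the SAMPLE string.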

1.5 Run

deepstream-app -c deepstream_app_config_yoloV3.txt

This completes the YOLOv3 deployment on DeepStream 6.0.

2. Deploying YOLOv4 with DeepStream 6.0

Here we deploy YOLOv4 in a different way: by loading a TensorRT engine directly instead of importing the original model.

2.1 Generate yolov4.engine with darknet2onnx2TRT

Download the original YOLOv4 darknet weights (Baidu Netdisk link):

https://pan.baidu.com/s/1dAGEW8cm-dqK14TbhhVetA     Extraction code: dm5b

Clone the model conversion project:

git clone https://github.com/Tianxiaomo/pytorch-YOLOv4.git Yolov42TRT

Run the model conversion:

cd Yolov42TRT

# darknet2onnx
python demo_darknet2onnx.py ./cfg/yolov4.cfg ./cfg/yolov4.weights ./data/dog.jpg 1

# onnx2trt
trtexec --onnx=./yolov4_1_3_608_608_static.onnx --fp16 --saveEngine=./yolov4.engine --device=0

This generates yolov4.engine.
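
The ONNX file name used above appears to encode the batch size, channel count, height and width (1, 3, 608, 608). As a hedged sketch, the Python snippet below rebuilds that name and assembles the same trtexec command from its parts. The naming pattern is inferred only from the file name in this article, the helper names onnx_name and trtexec_command are hypothetical, and only the trtexec flags already used above appear.

```python
import shlex


def onnx_name(batch, channels, height, width):
    # Naming pattern inferred from the file name used above (an assumption).
    return f"yolov4_{batch}_{channels}_{height}_{width}_static.onnx"


def trtexec_command(onnx_path, engine_path, fp16=True, device=0):
    """Assemble the trtexec call as an argument list."""
    cmd = ["trtexec", f"--onnx={onnx_path}"]
    if fp16:
        cmd.append("--fp16")
    cmd.append(f"--saveEngine={engine_path}")
    cmd.append(f"--device={device}")
    return cmd


cmd = trtexec_command("./" + onnx_name(1, 3, 608, 608), "./yolov4.engine")
print(shlex.join(cmd))
```

Because the exported model is static with batch size 1, the batch-size values in the DeepStream configuration files in the next step are kept at 1, and --fp16 corresponds to network-mode=2 there.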

2.2 Configure the DeepStream YOLOv4 inference project

Clone the DeepStream YOLOv4 inference project:

git clone https://github.com/NVIDIA-AI-IOT/yolov4_deepstream.git

cd yolov4_deepstream/deepstream_yolov4

Configure config_infer_primary_yoloV4.txt:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
#0=RGB, 1=BGR
model-color-format=0
model-engine-file=yolov4.engine
labelfile-path=labels.txt
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=80
gie-unique-id=1
network-type=0
is-classifier=0
## 0=Group Rectangles, 1=DBSCAN, 2=NMS, 3= DBSCAN+NMS Hybrid, 4 = None(No clustering)
cluster-mode=2
maintain-aspect-ratio=1
parse-bbox-func-name=NvDsInferParseCustomYoloV4
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so

[class-attrs-all]
nms-iou-threshold=0.6
pre-cluster-threshold=0.4
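
The [class-attrs-all] values work together with cluster-mode=2 (NMS): detections scoring below pre-cluster-threshold are dropped first, then overlapping boxes are suppressed using nms-iou-threshold. The sketch below is a plain-Python illustration of greedy NMS with those two numbers for a single class. It is not DeepStream's implementation, and the function names and sample boxes are invented for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(detections, score_threshold=0.4, iou_threshold=0.6):
    """Greedy NMS over (box, score) pairs belonging to one class."""
    candidates = [d for d in detections if d[1] >= score_threshold]
    candidates.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score))
    return kept


dets = [
    ((100, 100, 200, 200), 0.90),
    ((105, 105, 205, 205), 0.80),  # overlaps the first box heavily
    ((300, 300, 400, 400), 0.75),
    ((10, 10, 50, 50), 0.30),      # below the score threshold
]
kept = filter_and_nms(dets)
print([score for _, score in kept])  # [0.9, 0.75]
```

Raising nms-iou-threshold keeps more overlapping boxes, while raising pre-cluster-threshold discards more low-confidence detections before clustering.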

Configure deepstream_app_config_yoloV4.txt:

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5

[tiled-display]
enable=0
rows=1
columns=1
width=1280
height=720
gpu-id=0
nvbuf-memory-type=0

[source0]
enable=1
type=3
uri=file:/opt/nvidia/deepstream/deepstream-6.0/samples/streams/sample_1080p_h264.mp4
num-sources=1
gpu-id=0
cudadec-memtype=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=3
sync=0
source-id=0
gpu-id=0
nvbuf-memory-type=0
container=1
codec=1
output-file=yolov4.mp4

[osd]
enable=1
gpu-id=0
border-width=1
text-size=12
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=0
live-source=0
batch-size=1
batched-push-timeout=40000
width=1280
height=720
enable-padding=0
nvbuf-memory-type=0

[primary-gie]
enable=1
gpu-id=0
model-engine-file=yolov4.engine
labelfile-path=labels.txt
batch-size=1

bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_yoloV4.txt

[tracker]
enable=0
tracker-width=512
tracker-height=320
ll-lib-file=/opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_mot_klt.so

[tests]
file-loop=0

Copy the yolov4.engine generated in 2.1 to /opt/nvidia/deepstream/deepstream-6.0/sources/yolov4_deepstream.
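
A missing engine, label file or plugin library is a common reason for a failed start, so it can help to check the paths named in the config before launching. The sketch below does that with the Python standard library. It assumes that relative paths are resolved against the directory containing the config file; the helper missing_files is hypothetical, and the demo uses a throwaway temporary directory instead of the real project.

```python
import configparser
import tempfile
from pathlib import Path


def missing_files(config_path, section, keys):
    """Return the files named by `keys` in `section` that do not exist.

    Relative paths are resolved against the config file's directory,
    which is an assumption about how the paths are interpreted.
    """
    config_path = Path(config_path)
    if not config_path.is_file():
        raise FileNotFoundError(config_path)
    parser = configparser.ConfigParser(strict=False)
    parser.read(config_path)
    missing = []
    for key in keys:
        value = parser.get(section, key, fallback=None)
        if value is None:
            continue  # key not present in this config
        target = Path(value)
        if not target.is_absolute():
            target = config_path.parent / target
        if not target.exists():
            missing.append(str(target))
    return missing


# Demo in a temporary directory: labels.txt exists, the engine does not.
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "config_infer_primary_yoloV4.txt"
    cfg.write_text(
        "[property]\n"
        "model-engine-file=yolov4.engine\n"
        "labelfile-path=labels.txt\n"
    )
    (Path(tmp) / "labels.txt").write_text("person\n")
    result = missing_files(cfg, "property", ["model-engine-file", "labelfile-path"])
    print([Path(p).name for p in result])
```

Against the real project, the same function could be pointed at config_infer_primary_yoloV4.txt with the keys model-engine-file, labelfile-path and custom-lib-path.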

2.3 Build the project

Change to /opt/nvidia/deepstream/deepstream-6.0/sources/yolov4_deepstream:

cd /opt/nvidia/deepstream/deepstream-6.0/sources/yolov4_deepstream

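Run the following two commands in order to build the .so file:

export CUDA_VER=11.4    # set this to the CUDA version installed on your device

Alternatively, set CUDA_VER directly in /opt/nvidia/deepstream/deepstream-6.0/sources/yolov4_deepstream/nvdsinfer_custom_impl_Yolo/Makefile.

Then run the build: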

make -C nvdsinfer_custom_impl_Yolo

The build produces the shared library libnvdsinfer_custom_impl_Yolo.so.

2.4 Run

deepstream-app -c deepstream_app_config_yoloV4.txt

This completes the YOLOv4 deployment on DeepStream 6.0.

That covers how to deploy YOLOv3 and YOLOv4 with DeepStream 6.0. I hope it is of some help in your learning.

[WeChat official account link]
"[Model Inference] Tutorial: Deploying YOLOv3 and YOLOv4 with DeepStream 6.0"
