Building Deep Learning Applications with the Raspberry Pi 4B, Part 4: PyTorch

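Last time we installed OpenCV 4.4, so building libraries from source should now feel familiar. In this post we go a step further and compile and install a recent PyTorch release on the Raspberry Pi.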

PyTorch 1.6 adds many new APIs, tools for performance improvement and profiling, and major updates to distributed training based on Distributed Data Parallel (DDP) and Remote Procedure Call (RPC). Highlights include:

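Native support for automatic mixed-precision (AMP) training; a few extra lines of code can speed up training of large models by 50-60%.
Native TensorPipe support for tensor-aware communication
Support for complex tensors in the frontend API
New profiling tools that report tensor-level memory consumption
Numerous improvements and new features for distributed data parallel training and remote procedure calls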

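Compiling torch needs a lot of memory. On a Raspberry Pi with 2 GB of RAM or less, you can add swap space to avoid running out of memory (OOM). On the 4 GB and 8 GB models you can skip this step.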

sudo nano /etc/dphys-swapfile

Set the swap size to 4 GB. The file should look like this:

# /etc/dphys-swapfile - user settings for dphys-swapfile package
# author Neil Franklin, last modification 2010.05.05
# copyright ETH Zuerich Physics Departement
#   use under either modified/non-advertising BSD or GPL license

# this file is sourced with . so full normal sh syntax applies

# the default settings are added as commented out CONF_*=* lines
# where we want the swapfile to be, this is the default
#CONF_SWAPFILE=/var/swap

# set size to absolute value, leaving empty (default) then uses computed value
#   you most likely don't want this, unless you have an special disk situation
CONF_SWAPSIZE=4096

Save and exit, then restart the service so the change takes effect.

sudo service dphys-swapfile restart

Check that the swap size has changed.

swapon -s
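The same check can be scripted. The sketch below is a minimal illustration, not part of the original procedure: it parses text in the format of /proc/meminfo and reports whether the swap total is close to the requested 4096 MiB. The function names, the sample figures, and the 5% tolerance are assumptions; the tolerance is there because the kernel reports slightly less than the nominal swap file size. If the reported swap is far below the requested value, the package's CONF_MAXSWAP limit may be capping it.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: MiB} (the file lists kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0]) // 1024
    return fields


def swap_report(text, wanted_mib=4096, tolerance=0.05):
    """Summarise RAM and swap, flagging whether swap is near the wanted size."""
    info = parse_meminfo(text)
    swap = info.get("SwapTotal", 0)
    return {
        "ram_mib": info.get("MemTotal", 0),
        "swap_mib": swap,
        "swap_ok": swap >= wanted_mib * (1 - tolerance),
    }


# Illustrative sample only; on the Pi, pass open("/proc/meminfo").read() instead.
SAMPLE = "MemTotal:        1917292 kB\nSwapTotal:       4194300 kB\n"
print(swap_report(SAMPLE))
```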

First install the libraries needed for the build:

sudo apt-get install libopenblas-dev cython3 libatlas-base-dev m4 libblas-dev cmake
sudo apt-get install python3-dev python3-yaml python3-setuptools python3-wheel python3-pillow python3-numpy

deactivate   # leave the virtual environment used earlier for OpenCV
# create a new virtual environment
virtualenv -p python3 ~/my_envs/pytorch
source ~/my_envs/pytorch/bin/activate

export NO_CUDA=1
export NO_DISTRIBUTED=1
export NO_MKLDNN=1
export NO_NNPACK=1
export NO_QNNPACK=1
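If you would rather drive the build from a script than rely on exported shell variables, the flags can be passed explicitly. The sketch below is only an illustration under that assumption: the helper name build_env is hypothetical, and the flag names are copied unchanged from the export lines above.

```python
import os

# Flag names copied from the export lines; each one switches off an optional backend.
BUILD_FLAGS = ("NO_CUDA", "NO_DISTRIBUTED", "NO_MKLDNN", "NO_NNPACK", "NO_QNNPACK")


def build_env(base=None):
    """Return a copy of the environment with every build flag set to "1"."""
    env = dict(os.environ if base is None else base)
    env.update({flag: "1" for flag in BUILD_FLAGS})
    return env


# Demonstration on a tiny base environment.
env = build_env({"PATH": "/usr/bin"})
print(sorted(env))

# Inside the pytorch checkout, the build could then be started with:
#   subprocess.run([sys.executable, "setup.py", "bdist_wheel"], env=build_env(), check=True)
```

Passing the environment explicitly keeps the flags from being lost if the build is restarted later from a fresh shell.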

pip3 install numpy pyyaml

Tip:

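Make sure numpy is installed inside the virtual environment. The build still succeeds without numpy, but the resulting PyTorch has no numpy support and fails with the error "PyTorch was compiled without NumPy support".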
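Because a missing numpy only shows up after hours of compiling, a quick pre-flight check is cheap insurance. This is a minimal sketch using only the standard library; the function names are hypothetical, and "yaml" is the import name of the pyyaml package.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level modules from names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def in_virtualenv():
    """True inside a venv (base_prefix differs) or an old virtualenv (real_prefix set)."""
    if getattr(sys, "real_prefix", None) is not None:
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtualenv())
print("missing build-time modules:", missing_modules(["numpy", "yaml"]) or "none")
```

Run it with the interpreter from the activated environment; if numpy is listed as missing, install it before starting the build.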

git clone https://github.com/pytorch/pytorch.git
cd pytorch
# list the available versions and pick the one to build
git branch -a
git tag
git checkout v1.6.0
git submodule update --init  --recursive
git submodule update --remote third_party/protobuf

python3 setup.py bdist_wheel

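What follows is a long build of more than 5 hours. If compiling OpenCV left just enough time for a coffee break, compiling PyTorch leaves enough time for a good night's sleep.

While waiting, install s-tui, a tool that shows CPU temperature and utilization, to keep an eye on the system.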

sudo pip install s-tui  --ignore-installed
sudo s-tui

The CPU stays at full load for the whole build.
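s-tui is interactive, so for an unattended multi-hour build it can help to log the temperature from a script instead. The sketch below assumes the kernel exposes the SoC temperature in millidegrees Celsius at /sys/class/thermal/thermal_zone0/temp, as Raspberry Pi OS normally does. The 80 °C warning threshold and the function names are assumptions for illustration.

```python
def millidegrees_to_celsius(raw):
    """Convert the raw sysfs reading (millidegrees Celsius) to degrees."""
    return int(raw.strip()) / 1000.0


def is_hot(celsius, limit=80.0):
    """Flag readings at or above the assumed warning threshold."""
    return celsius >= limit


# Illustrative reading; on the Pi, use
#   open("/sys/class/thermal/thermal_zone0/temp").read()
sample = "61832\n"
temp = millidegrees_to_celsius(sample)
print("{:.1f} C, hot: {}".format(temp, is_hot(temp)))
```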

cd dist
pip3 install ./torch-1.6.0a0+b31f58d-cp37-cp37m-linux_armv7l.whl

Once pip reports a successful install, PyTorch is ready to use.
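The wheel's file name encodes which interpreter and platform it was built for, which is worth checking before copying it to another board. The sketch below splits a name according to the standard wheel naming convention (name, version, optional build tag, Python tag, ABI tag, platform tag). The helper names are hypothetical, and the interpreter check only handles CPython tags such as cp37.

```python
import sys


def parse_wheel_name(filename):
    """Split a wheel file name into its standard components."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    if len(parts) not in (5, 6):  # 6 when an optional build tag is present
        raise ValueError("not a wheel file name: " + filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def matches_interpreter(info, version_info=None):
    """True if the wheel's CPython tag matches the given (major, minor) version."""
    v = sys.version_info if version_info is None else version_info
    return info["python"] == "cp{}{}".format(v[0], v[1])


info = parse_wheel_name("torch-1.6.0a0+b31f58d-cp37-cp37m-linux_armv7l.whl")
print(info)
print("built for Python 3.7:", matches_interpreter(info, (3, 7)))
```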

git clone https://github.com/pytorch/vision.git

PyTorch 1.6 pairs with torchvision 0.7. Check out that version and install PIL support.

pip3 install pillow
cd vision
git checkout v0.7.0-rc4
git submodule update --init   --recursive
python3 setup.py bdist_wheel

Tip:

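If the build stops with a compile error at this point, the cause is two places in the source where a variable has the wrong type and needs an explicit cast to size_t. Edit seekable_buffer.cpp and util.cpp accordingly.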

cd dist
pip3 install ./torchvision-0.7.0a0+78ed10c-cp37-cp37m-linux_armv7l.whl

Done!

git clone https://github.com/ultralytics/yolov5

cd ~/my_envs/pytorch/lib/python3.7/site-packages
ln -s /usr/local/lib/python3.7/site-packages/cv2 cv2

Tip:

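To remove the symlink, run rm -rf ./cv2. Be careful never to add a trailing /, because that can make rm act on the directory the link points to rather than on the link itself.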
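The difference between removing a link and removing its target can be tried safely in a throwaway directory. The sketch below is purely illustrative: the directory names stand in for the real OpenCV and site-packages paths, and os.unlink is used as a removal call that only ever touches the link.

```python
import os
import tempfile


def symlink_roundtrip():
    """Create a directory symlink, remove only the link, report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "real_cv2")   # stands in for the OpenCV build
        os.mkdir(target)
        marker = os.path.join(target, "__init__.py")
        open(marker, "w").close()

        link = os.path.join(tmp, "cv2")          # stands in for site-packages/cv2
        os.symlink(target, link)
        was_link = os.path.islink(link)

        os.unlink(link)                          # removes the link, never the target
        link_gone = not os.path.lexists(link)
        target_intact = os.path.exists(marker)
        return was_link, link_gone, target_intact


print(symlink_roundtrip())
```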

pip install tqdm
pip install matplotlib
pip install scipy

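Test with the smallest model, yolov5s, by running object detection on two images. Detection quality is decent but the speed is mediocre: 3.8 seconds for one image and 2.8 seconds for the other, roughly 0.3 fps. Later we can compare this against OpenVINO acceleration.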

cd yolov5
python3 detect.py --source ./inference/images/ --weights weights/yolov5s.pt --conf 0.5
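The 0.3 fps figure quoted above follows directly from the two per-image timings: two images in 3.8 + 2.8 = 6.6 seconds. A small sketch of that arithmetic (the function name is hypothetical):

```python
def average_fps(seconds_per_image):
    """Frames per second averaged over a list of per-image inference times."""
    return len(seconds_per_image) / sum(seconds_per_image)


fps = average_fps([3.8, 2.8])  # timings reported for the two test images
print("average: {:.2f} fps".format(fps))
```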

At this point PyTorch 1.6 is working on the Raspberry Pi.

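To skip the lengthy build, you can download a prebuilt whl and install it with pip install. For Python 3.7, besides PyTorch 1.6 + torchvision 0.7, I have also built the newer PyTorch 1.7 + torchvision 0.8 (mind the version pairing when installing).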
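The version pairing can be checked mechanically before installing. This sketch covers only the two pairs named in this article and treats any other PyTorch version as unknown. The table and function names are hypothetical.

```python
# Pairs named in this article: PyTorch 1.6 with torchvision 0.7, 1.7 with 0.8.
TORCHVISION_FOR_TORCH = {"1.6": "0.7", "1.7": "0.8"}


def major_minor(version):
    """Reduce a version such as '1.6.0a0+b31f58d' to '1.6'."""
    return ".".join(version.split("+")[0].split(".")[:2])


def versions_match(torch_version, torchvision_version):
    """True only for a known PyTorch release paired with its torchvision release."""
    expected = TORCHVISION_FOR_TORCH.get(major_minor(torch_version))
    return expected is not None and major_minor(torchvision_version) == expected


print(versions_match("1.6.0a0+b31f58d", "0.7.0a0+78ed10c"))
print(versions_match("1.7.0", "0.7.0"))
```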

Next time we will set up a TensorFlow development environment and run TensorFlow Lite to see how fast a bare Raspberry Pi can run inference. Stay tuned ...

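Introduction
New features in PyTorch 1.6
Adding swap space (optional)
  1. Edit the configuration file
PyTorch build dependencies
  1. Install dependencies
  2. Switch virtual environments
Building and installing PyTorch
  1. Set build options
  2. Install Python libraries
  3. Download the source and submodules
  4. Build the whl package
  5. Install PyTorch
Building and installing torchvision
  1. Download the source
  2. Select the matching version
  3. Install torchvision
Running YOLOv5
  1. Clone the yolov5 source
  2. Symlink to OpenCV
  3. Install dependencies
  4. Image inference
Downloads
Coming up next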