Background
The source code was cloned from https://github.com/ultralytic... . Right after cloning, I had no idea how to start training on my own dataset, and the tutorial on the project's GitHub page left me just as confused. After two days of digging I got it working, so I'm writing it up in the hope that it helps others in the same situation. If you find it useful, a like or a bookmark is appreciated, and if you spot mistakes please point them out in the comments! For context: my local machine is a Mac, and I train remotely on a server with 4 GPUs.
1. Clone the yolov3 source code, prepare your dataset, and prepare the pretrained weights
I use the VOC dataset here; remember to extract it to wherever you want to keep it:
wget -c http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
Download the pretrained weights (the ImageNet-pretrained Darknet-53 backbone):
wget -c https://pjreddie.com/media/files/darknet53.conv.74
I extracted it directly into the project root. You can pick your own path, but some of the code below will then need to be adjusted to match.
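For reference, a minimal extraction sketch: the tar unpacks into VOCdevkit/VOC2012, and I assume you move that up one level so it matches the ./VOC2012 paths used throughout this post.

tar -xf VOCtrainval_11-May-2012.tar
mv VOCdevkit/VOC2012 ./VOC2012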
A quick word about the folders inside VOC2012. The last two can be ignored; they are never used. The labels folder and the labels.npy file were generated by scripts I ran later, so they won't exist right after extraction; don't worry about them for now. The Annotations folder holds an XML file for each image, describing that image in markup. Inside ImageSets we only care about the Main folder; you can delete the files in it, because later we'll generate our own txt files there, listing the image names of the training, validation, test, and trainval sets.
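For orientation, the directory ends up looking roughly like this (SegmentationClass and SegmentationObject are the two unused folders; labels and labels.npy appear only after the later steps):

VOC2012/
├── Annotations/          # one XML description per image
├── ImageSets/
│   └── Main/             # train/val/test/trainval txt lists (generated in step 2)
├── JPEGImages/           # the images themselves
├── labels/               # YOLO-format txt labels (generated in step 4)
├── labels.npy            # cache generated later during training
├── SegmentationClass/    # unused here
└── SegmentationObject/   # unused here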
2. Add a script
It randomly splits the dataset into training, validation, test, and trainval sets and writes the corresponding txt files into VOC2012/ImageSets/Main.
get_train_val_txt.py (script code below; copy it and adapt as needed)
import os
import random

random.seed(0)

xmlfilepath = r'./VOC2012/Annotations'       # change to your own Annotations path
saveBasePath = r"./VOC2012/ImageSets/Main/"  # change to your own ImageSets/Main path

# Pick your own values here; they decide the relative sizes of the splits
trainval_percent = .8
train_percent = .7

temp_xml = os.listdir(xmlfilepath)
total_xml = []
for xml in temp_xml:
    if xml.endswith(".xml"):
        total_xml.append(xml)

num = len(total_xml)
list = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(list, tv)
train = random.sample(trainval, tr)

ftrainval = open(os.path.join(saveBasePath, 'trainval.txt'), 'w')
ftest = open(os.path.join(saveBasePath, 'test.txt'), 'w')
ftrain = open(os.path.join(saveBasePath, 'train.txt'), 'w')
fval = open(os.path.join(saveBasePath, 'val.txt'), 'w')

for i in list:
    name = total_xml[i][:-4] + '\n'
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
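Run it from the project root (the paths above are relative to it), e.g. python3 get_train_val_txt.py. With the example ratios, 80% of the images go into trainval and that set is further split 70/30, so overall you get 56% train, 24% val, and 20% test.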
3. Create two new files in the data folder
Create a voc2012.data file with the following configuration (train and valid point to the image-list files that voc_label.py will generate in step 4):
classes=20  # change to the total number of classes in your dataset
train=data/train.txt
valid=data/val.txt
names=data/voc2012.names
backup=backup/
Create a voc2012.names file. Since I'm using the VOC dataset, mine is shown below; replace it with your own class names, one per line:
person
bird
cat
cow
dog
horse
sheep
aeroplane
bicycle
boat
bus
car
motorbike
train
bottle
chair
diningtable
pottedplant
sofa
tvmonitor
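Note that the order matters: a class's line number in this file, counting from 0, is the class index written into the label files in step 4.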
4. Create a voc_label.py script
# -*- coding: utf-8 -*-
"""
Things you need to modify:
1. Replace sets with your own dataset
2. Replace classes with your own class names
"""
import xml.etree.ElementTree as ET
import pickle
import os
from os import listdir, getcwd
from os.path import join

sets = [('2012', 'train'), ('2012', 'test'), ('2012', 'val')]  # replace with your own dataset, format [(year, image_set)]
classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
           "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
           "pottedplant", "sheep", "sofa", "train", "tvmonitor"]  # change to your own class names

# Converts a VOC pixel box (xmin, xmax, ymin, ymax) to YOLOv3's normalized (x, y, w, h)
def convert(size, box):
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)

# Converts one annotation XML file to a YOLOv3-format label file
def convert_annotation(year, image_id):
    in_file = open('VOC%s/Annotations/%s.xml' % (year, image_id))   # change to your own annotation XML location
    out_file = open('VOC%s/labels/%s.txt' % (year, image_id), 'w')  # where the labels are written; I keep mine in ./VOC2012/labels, adjust to your setup
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
             float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')

wd = getcwd()

# Adjust the paths below to match where your dataset and labels live
for year, image_set in sets:
    if not os.path.exists('VOC%s/labels/' % (year)):
        os.makedirs('VOC%s/labels/' % (year))
    image_ids = open('VOC%s/ImageSets/Main/%s.txt' % (year, image_set)).read().strip().split()
    list_file = open('data/%s.txt' % (image_set), 'w')
    for image_id in image_ids:
        list_file.write('VOC%s/JPEGImages/%s.jpg\n' % (year, image_id))
        convert_annotation(year, image_id)
    list_file.close()
Run the script, and a txt file for each image is generated under VOC2012/labels (the location I chose above).
Open any of these files and you'll see the label format YOLOv3 expects, one object per line: the class index (the class's line number in voc2012.names, counting from 0), followed by the normalized x center, normalized y center, normalized width, and normalized height.
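As a made-up example: in a 500×375 image, a dog (class index 4 in the voc2012.names above) with a VOC box of xmin=100, xmax=300, ymin=50, ymax=250 becomes

4 0.398 0.39733333 0.4 0.53333333

because x = ((100+300)/2 - 1)/500 = 0.398, y = ((50+250)/2 - 1)/375 ≈ 0.3973, w = (300-100)/500 = 0.4, and h = (250-50)/375 ≈ 0.5333.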
5. Skim train.py and datasets.py and check whether any paths there need changing
For me it was the labels path that needed changing; until I fixed it, training kept failing because the labels could not be found. Here is what I changed.
In train.py I commented out the attempt_download(weights) call, since I had already downloaded the pretrained weights by hand. You can search for that function in train.py and comment it out as well.
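For clarity, the change amounts to something like this (the exact line and its surroundings depend on your version of the repo):

# attempt_download(weights)  # commented out: darknet53.conv.74 was downloaded manually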
I also changed the labels path in datasets.py, as follows:
My label txt files live under VOC2012/labels/ and my images under VOC2012/JPEGImages/, so the image-to-label path substitution has to replace JPEGImages with labels.
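As a sketch of the kind of change involved (not the verbatim upstream line; datasets.py differs across versions of the repo, but in mine the label paths were derived from the image paths by a string replacement):

# datasets.py (sketch): derive each label path from its image path
# by swapping the folder name and the file extension
self.label_files = [x.replace('JPEGImages', 'labels').replace('.jpg', '.txt')
                    for x in self.img_files]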
6. Modify the network config file yolov3.cfg
The main changes are in the following places.
If you have a GPU, set these as I did:

batch=64
subdivisions=16

If you are on CPU only, leave both values at 1.
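For context: batch is the number of images consumed per training step, while subdivisions splits each batch into smaller chunks that are pushed through the GPU one at a time, so raising subdivisions trades speed for lower memory use.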
Next, search the file for classes; it appears in three places. Change every occurrence to the total number of classes in your dataset, and in the [convolutional] layer directly above each classes line, change filters according to the formula $(classes+5)*3$. With my 20 VOC classes that gives $(20+5)*3=75$. The relevant parts of my cfg are below; everything elided is identical to the stock yolov3.cfg.
[net]
# Testing
#batch=1
#subdivisions=1
# Training
batch=64
subdivisions=16
width=608
height=608
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 500200
policy=steps
steps=400000,450000
scales=.1,.1

# ... the entire Darknet-53 backbone and the detection-head layers are
# unchanged from the stock yolov3.cfg and are omitted here; only the three
# modified spots (filters=75 and classes=20) follow ...

[convolutional]
size=1
stride=1
pad=1
filters=75
activation=linear

[yolo]
mask = 6,7,8
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=20
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1

# ... unchanged [route]/[convolutional]/[upsample] layers ...

[convolutional]
size=1
stride=1
pad=1
filters=75
activation=linear

[yolo]
mask = 3,4,5
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=20
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1

# ... unchanged [route]/[convolutional]/[upsample] layers ...

[convolutional]
size=1
stride=1
pad=1
filters=75
activation=linear

[yolo]
mask = 0,1,2
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=20
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
7. From your project root (the yolov3 folder), run the training script
Mind the file paths in the command and adjust them to your own setup; the number of epochs (--epochs) is also up to you.
python3 train.py --data ./data/voc2012.data --cfg ./cfg/yolov3.cfg --epochs 3 --weights ./weights/darknet53.conv.74
Done! I ran only 3 epochs here so the example would produce results quickly. The best weights of the trained model are saved to ./weights/best.pt. If you also train on a remote server and want to view the TensorBoard dashboard in your local browser, see my other post:
https://segmentfault.com/a/11...