非极大值抑制NonMaximum-Suppression

38次阅读

共计 1140 个字符，预计需要花费 3 分钟才能阅读完成。

在模型预测阶段，我们先为图像生成多个锚框，并为这些锚框一一预测类别和偏移量。随后，我们根据锚框及其预测偏移量得到预测边界框。当锚框数量较多时，同一个目标上可能会输出较多相似的预测边界框。了使结果更加简洁，我们可以移除相似的预测边界框。常用的方法叫作非极大值抑制（non-maximum suppression，NMS）。

非极大值抑制的流程如下：

根据置信度得分进行排序
选择置信度最高的比边界框添加到最终输出列表中，将其从边界框列表中删除
计算所有边界框的面积
计算置信度最高的边界框与其它候选框的 IoU。
删除 IoU 大于阈值的边界框
重复上述过程，直至边界框列表为空。

下面的实现代码摘自《动手学深度学习（TF2.0 版）》

from collections import namedtuple
Pred_BB_Info = namedtuple("Pred_BB_Info", 
        ["index", "class_id", "confidence", "xyxy"])

def non_max_suppression(bb_info_list, nms_threshold=0.5):
    """
    非极大抑制处理预测的边界框
    Args:
        bb_info_list: Pred_BB_Info 的列表, 包含预测类别、置信度等信息
        nms_threshold: 阈值
    Returns:
        output: Pred_BB_Info 的列表, 只保留过滤后的边界框信息
    """
    output = []
    # 现根据置信度从高到底排序
    sorted_bb_info_list = sorted(bb_info_list,
                    key = lambda x: x.confidence, 
                    reverse=True)
    while len(sorted_bb_info_list) != 0:
        best = sorted_bb_info_list.pop(0)
        output.append(best)

        if len(sorted_bb_info_list) == 0:
            break
        bb_xyxy = []
        for bb in sorted_bb_info_list:
            bb_xyxy.append(bb.xyxy)

        iou = compute_jaccard(tf.convert_to_tensor(best.xyxy),
                    tf.squeeze(tf.convert_to_tensor(bb_xyxy), axis=1))[0] # shape: (len(sorted_bb_info_list), )
        n = len(sorted_bb_info_list)
        sorted_bb_info_list = [sorted_bb_info_list[i] for i in 
                    range(n) if iou[i] <= nms_threshold]
    return output

正文完