mAP in object detection

Posted on 2021-10-15 In Machine Learning

IoU (Intersection over Union): the ratio of the overlapping area of two bounding boxes (ground truth and proposed) to the area of their union. \(IoU=\frac{|A \cap B|}{|A \cup B|}\)
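As a minimal sketch of the IoU formula, assuming axis-aligned boxes stored as `(x1, y1, x2, y2)` corner tuples (the box format and the function name `iou` are illustrative choices, not something the definition above prescribes):

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height are clamped to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    # Degenerate (zero-area) boxes are given IoU 0 by convention here.
    return inter / union if union > 0 else 0.0
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` has an intersection of 1 and a union of 7, so the result is 1/7.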

TP (True Positive): a detection whose IoU with a ground truth box is >= IoU_threshold (IoU_threshold is typically 0.5). The TP count says how many real bounding boxes were detected (each ground truth is counted only once).

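FP (False Positive): a detection whose IoU with the ground truth is < IoU_threshold (IoU_threshold is typically 0.5), or a repeated detection of a ground truth that is already matched. The FP count says how many detection boxes are invalid (IoU too low) or redundant (the same ground truth detected more than once).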

FN (False Negative): how many real bounding boxes were not detected (each ground truth is counted only once).

Precision = \(\frac{TP}{TP+FP}\): of the bounding boxes reported as detections, how many are valid. \(Precision=\frac{TrueSamplesDetected}{AllDetections}\)

Recall = \(\frac{TP}{TP+FN}\): of the ground truth bounding boxes, how many were detected. \(Recall=\frac{TrueSamplesDetected}{AllTrueSamples}\)
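A small worked example of the two formulas, with made-up counts (the helper name `precision_recall` and the numbers are purely illustrative):

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from raw TP/FP/FN counts."""
    # Guard against division by zero when there are no detections
    # or no ground truth boxes (returning 0.0 is a convention chosen here).
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


# 10 ground truth boxes, 8 detections, 6 of which are valid:
# TP = 6, FP = 8 - 6 = 2, FN = 10 - 6 = 4.
p, r = precision_recall(tp=6, fp=2, fn=4)
print(p, r)  # 0.75 0.6
```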

PR curve: the curve with recall on the x-axis and precision on the y-axis.

Suppose there are m targets (ground truth) and we propose n bounding boxes (predictions).
Mark every bounding box as TP or FP. If a target is detected by more than one bounding box, mark the one with the highest IoU (>= IoU_threshold) as TP and the others as FP.
Then, walking through the predictions one at a time (usually in descending order of confidence score) and accumulating the TP and FP counts, we can derive the PR curve.
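The marking rule above can be sketched as follows. This is one possible reading, not a reference implementation: it assumes boxes are `(x1, y1, x2, y2)` tuples and that each detection is assigned to the ground truth it overlaps most. The names `iou` and `label_detections` are hypothetical helpers.

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def label_detections(gt_boxes, det_boxes, iou_threshold=0.5):
    """Return one boolean per detection: True for TP, False for FP."""
    best_for_gt = {}  # ground truth index -> (best IoU so far, detection index)
    for d, det in enumerate(det_boxes):
        ious = [iou(det, gt) for gt in gt_boxes]
        if not ious:
            continue  # no ground truth at all: every detection is FP
        # Assumption: a detection belongs to the ground truth it overlaps most.
        g = max(range(len(ious)), key=ious.__getitem__)
        if ious[g] < iou_threshold:
            continue  # invalid detection (IoU too low): stays FP
        # Keep only the highest-IoU detection per ground truth.
        if g not in best_for_gt or ious[g] > best_for_gt[g][0]:
            best_for_gt[g] = (ious[g], d)
    is_tp = [False] * len(det_boxes)
    for _, d in best_for_gt.values():
        is_tp[d] = True
    return is_tp


gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(0, 0, 10, 10), (1, 1, 10, 10), (50, 50, 60, 60)]
print(label_detections(gt, dets))  # [True, False, False]
```

In this example the second detection is a redundant hit on the first target (IoU 0.81, lower than the exact match), the third overlaps nothing, and the second target is never detected, so FN = 1.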

Precision = accumulatedTP / (accumulatedTP+accumulatedFP)

Recall = accumulatedTP / m (the number of all ground truth targets)
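A minimal sketch of how the accumulated counts turn into PR points. It assumes each detection carries a confidence score and that counts are accumulated in descending score order, as is common in PASCAL VOC style evaluation. The function name `pr_curve` and its arguments are illustrative.

```python
def pr_curve(scores, is_tp, num_gt):
    """Precision/recall after each detection, highest confidence first.

    scores : confidence of each detection (assumed available)
    is_tp  : True if the detection was marked TP, False if FP
    num_gt : number of ground truth targets (m), assumed > 0
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    precisions, recalls = [], []
    acc_tp = acc_fp = 0
    for i in order:
        if is_tp[i]:
            acc_tp += 1
        else:
            acc_fp += 1
        precisions.append(acc_tp / (acc_tp + acc_fp))
        recalls.append(acc_tp / num_gt)
    return precisions, recalls


# Four detections against four targets; the second one is a false positive.
p, r = pr_curve([0.9, 0.8, 0.7, 0.6], [True, False, True, True], num_gt=4)
print(r)  # [0.25, 0.25, 0.5, 0.75]
```

Recall never decreases along the list, while precision drops at every FP and recovers at every TP, which gives the PR curve its sawtooth shape.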

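AP = the area under the PR curve. (It can be approximated by sampling the PR curve at uniformly spaced recall values, where the precision at each sampled point is obtained by linear interpolation.)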
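The sampling approach can be sketched roughly as below. Several details are assumptions, since the text does not fix them: 101 sample points, precision taken as 0 beyond the highest recall reached, and the highest precision kept when several curve points share one recall value. Benchmarks such as PASCAL VOC and COCO use their own interpolation rules, so this is not a drop-in replacement for either.

```python
def average_precision(precisions, recalls, num_samples=101):
    """Approximate the area under the PR curve by uniform recall sampling."""
    # Collapse duplicate recall values, keeping the highest precision
    # (an assumption; ties are not specified in the text).
    best = {}
    for p, r in zip(precisions, recalls):
        best[r] = max(p, best.get(r, 0.0))
    xs = sorted(best)
    ys = [best[x] for x in xs]

    def interp(r):
        if not xs or r > xs[-1]:
            return 0.0  # this recall level is never reached
        if r <= xs[0]:
            return ys[0]  # extend the first point to the left
        for k in range(1, len(xs)):
            if r <= xs[k]:
                # Linear interpolation between neighboring curve points.
                t = (r - xs[k - 1]) / (xs[k] - xs[k - 1])
                return ys[k - 1] + t * (ys[k] - ys[k - 1])
        return 0.0

    levels = [i / (num_samples - 1) for i in range(num_samples)]
    return sum(interp(r) for r in levels) / num_samples
```

A perfect detector (precision 1.0 at every recall up to 1.0) gets an AP of 1.0, and a detector that finds only half of the targets cannot score much above 0.5, whatever its precision.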

So far, we have derived AP for a single class of target. Compute AP for every class, then average the results to get mAP.
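The last step is a plain average over classes. A tiny sketch, with hypothetical class names and AP values chosen only for illustration:

```python
def mean_average_precision(ap_per_class):
    """mAP = unweighted mean of the per-class AP values."""
    if not ap_per_class:
        return 0.0  # convention chosen here for the empty case
    return sum(ap_per_class.values()) / len(ap_per_class)


# Hypothetical per-class AP values.
aps = {"cat": 0.5, "dog": 0.75, "car": 1.0}
print(mean_average_precision(aps))  # 0.75
```

Each class contributes equally here, regardless of how many instances it has in the dataset.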