Skip to content

IoU API

IoU variants are pluggable similarity metrics used during detection-to-track association. You pass one of these classes to a tracker via the iou argument.

What you'll learn:

  • What each IoU variant measures and when to use it
  • How score ranges affect minimum_iou_threshold tuning
  • How to configure IoU variants in any tracker
  • How IoU variants perform across common tracking benchmarks

Install

Get started by installing trackers.

pip install trackers

For more options, see the install guide.


Quickstart

from trackers import SORTTracker
from trackers.utils.iou import IoU

tracker = SORTTracker(
    iou=IoU(),
    minimum_iou_threshold=0.3,
)

Choosing a Metric

Variant Score range When to use
IoU [0, 1] Default — strong baseline for most scenes
GIoU [-1, 1] Scenes where boxes frequently lose overlap (occlusion, re-entry)
DIoU [-1, 1] Fast-moving objects; centre-distance signal without aspect sensitivity
CIoU [-1, 1] Same as DIoU plus aspect-ratio consistency
BIoU [0, 1] Very small or very fast objects where raw boxes rarely overlap

Formula Summary (A, B boxes, C enclosing box, d center distance, c enclosing diagonal):

  • \( \mathrm{GIoU} = \mathrm{IoU} - \frac{|C \setminus (A \cup B)|}{|C|} \)
  • \( \mathrm{DIoU} = \mathrm{IoU} - \frac{d^2}{c^2 + \epsilon} \)
  • \( \mathrm{CIoU} = \mathrm{DIoU} - \alpha v \), where \( v = \frac{4}{\pi^2}\left(\arctan\frac{w_A}{h_A} - \arctan\frac{w_B}{h_B}\right)^2 \) and \( \alpha = \frac{v}{1 - \mathrm{IoU} + v + \epsilon} \)

IoU

Standard Intersection over Union — the classic baseline.

\[ \mathrm{IoU}(A, B) = \frac{|A \cap B|}{|A \cup B|} \]
IoU visualization

Scores are 0 (no overlap) to 1 (perfect overlap). Because it returns 0 whenever boxes do not intersect, the tracker gets no gradient to recover a lost track; a variant from the list below can help in those cases.

from trackers import OCSORTTracker, SORTTracker
from trackers.utils.iou import BIoU, CIoU, GIoU, IoU

# Standard IoU in SORT
sort_iou = SORTTracker(iou=IoU(), minimum_iou_threshold=0.3)

# GIoU in OC-SORT (negative thresholds are valid)
ocsort_giou = OCSORTTracker(iou=GIoU(), minimum_iou_threshold=-0.3)

# CIoU in OC-SORT
ocsort_ciou = OCSORTTracker(iou=CIoU(), minimum_iou_threshold=-0.3)

# Buffered IoU in SORT
sort_biou = SORTTracker(
    iou=BIoU(buffer_ratio=0.1),
    minimum_iou_threshold=0.3,
)

Threshold Notes

Set minimum_iou_threshold based on the score range of your chosen metric.

  • IoU and BIoU usually work with non-negative thresholds (for example, 0.2 to 0.5).
  • GIoU, DIoU, and CIoU can produce negative scores, so negative thresholds are valid.
  • Tune thresholds per dataset and tracker; there is no universal best value.

GIoU

Generalised IoU (Rezatofighi et al., 2019) — penalises the gap inside the smallest enclosing box C that neither A nor B fills.

\[ \mathrm{GIoU}(A, B) = \mathrm{IoU} - \frac{|C \setminus (A \cup B)|}{|C|} \]
GIoU visualization

When boxes do not overlap at all, IoU is flat at 0, but the penalty term still changes as boxes move closer or farther apart, giving the tracker a meaningful signal based on distances, sizes and shapes.

from trackers import OCSORTTracker
from trackers.utils.iou import GIoU

# Negative thresholds are valid and often optimal for GIoU
tracker = OCSORTTracker(iou=GIoU(), minimum_iou_threshold=-0.3)

Example — SportsMOT v_0kUtTtmLaJA_c006

HOTA (%) Δ (pts)
Best IoU 73.07
Best GIoU 89.31 +16.24

Left: IoU. Right: GIoU. Camera movements can introduce unexpected displacement, producing ID switches with IoU-based association. GIoU still provides a signal when there is no overlap by considering enclosing-box geometry, which helps preserve tracks that IoU would otherwise confuse or lose due to direction changes and non-linear motion (for example, tracks 5, 12 on the left vs 13 on the right).


DIoU

Distance IoU (Zheng et al., 2019) — adds a centre-distance penalty to IoU, normalised by the enclosing box diagonal.

\[ \mathrm{DIoU}(A, B) = \mathrm{IoU} - \frac{d^2}{c^2 + \epsilon} \]
DIoU visualization

where d is the Euclidean distance between box centres and c is the diagonal of the smallest enclosing rectangle. This encourages centre alignment independently of aspect ratio and tends to produce smoother associations in fast-motion sequences.

from trackers import OCSORTTracker
from trackers.utils.iou import DIoU

tracker = OCSORTTracker(iou=DIoU(), minimum_iou_threshold=-0.3)

Example — SportsMOT v_0kUtTtmLaJA_c006

HOTA (%) Δ (pts)
Best IoU 73.07
Best DIoU 86.53 +13.46

Left: IoU. Right: DIoU. Highly non-linear motion can make IoU drop to zero, causing the Kalman prediction to attach to another object and produce an ID switch. The centre-distance term keeps the score smoother and preserves IDs more often (for example, tracks 3–5).


CIoU

Complete IoU (Zheng et al., 2019) — extends DIoU with a penalty for aspect-ratio mismatch between the two boxes.

\[ \mathrm{CIoU}(A, B) = \mathrm{DIoU} - \alpha v \]
\[ v = \frac{4}{\pi^2}\!\left(\arctan\frac{w_A}{h_A} - \arctan\frac{w_B}{h_B}\right)^2, \quad \alpha = \frac{v}{1 - \mathrm{IoU} + v + \epsilon} \]
CIoU visualization

v measures aspect-ratio divergence; α scales it so the penalty is low when IoU is already high. On tracking benchmarks CIoU and DIoU behave similarly.

from trackers import OCSORTTracker
from trackers.utils.iou import CIoU

tracker = OCSORTTracker(iou=CIoU(), minimum_iou_threshold=-0.3)

Example — SoccerNet SNMOT-122

HOTA (%) Δ (pts)
Best IoU 77.36
Best CIoU 85.58 +8.22

Left: IoU. Right: CIoU. In this example, CIoU is capable of perfectly keeping the track of the ball, which is explained by the fact that the ball is a small and fast moving object, with roughly constant aspect ratio, where CIoU’s distance + aspect terms help more than overlap alone.


BIoU

Buffered IoU (Yang et al., 2022) — expands each box by a relative margin r before computing standard IoU. Let w = x2 − x1, h = y2 − y1:

\[ A^r = (x_1 - rw,\; y_1 - rh,\; x_2 + rw,\; y_2 + rh) \]
\[ \mathrm{BIoU}_r(A, B) = \mathrm{IoU}(A^r, B^r) \]
BIoU visualization

r = 0 recovers plain IoU exactly. Enlarging boxes creates artificial overlap for objects that are geometrically close, which is useful when detections are very small or objects move fast enough so that consecutive boxes miss each other entirely.

from trackers import SORTTracker
from trackers.utils.iou import BIoU

tracker = SORTTracker(iou=BIoU(buffer_ratio=0.15), minimum_iou_threshold=0.3)

Example — SportsMOT v_9MHDmAMxO5I_c004

HOTA (%) Δ (pts)
Best IoU 80.54
Best BIoU 88.00 +7.46

Left: IoU. Right: BIoU. Notice how ID switches happen when fast players temporarily produce non-overlapping boxes between frames. The buffer closes that gap and keeps the same ID. (e.g. tracks 7 and 8).


IoU Variant Performance Across Benchmarks

We evaluate how much each variant changes performance across datasets. For each (dataset, tracker) pair, we keep the state_estimator with the highest IoU HOTA on the evaluation split, then report mean ΔHOTA = HOTA(variant) − HOTA(IoU) over trackers (same split; thresholds tuned per experiment).

For more information on the datasets, see: dataset comparison.

Dataset IoU mean HOTA GIoU mean Δ DIoU mean Δ CIoU mean Δ BIoU mean Δ
MOT17 val 38.09 −0.09 −0.04 −0.04 −0.28
SportsMOT val 80.21 +0.65 +0.95 +0.88 +0.36
DanceTrack val 50.27 −0.80 −0.34 +0.05 +0.15
SoccerNet test 83.21 +1.57 +2.82 +2.76 +1.41

Over SportsMOT and SoccerNet, all IoU variants outperform standard IoU, with DIoU and CIoU strongest on SoccerNet and DIoU slightly ahead of CIoU on SportsMOT. In MOT17, standard IoU is best by a small margin (DIoU and CIoU are similar). On DanceTrack, GIoU and DIoU underperform IoU, while CIoU and BIoU perform slightly better.

These experiments suggest IoU variants provide task-dependent gains, with larger improvements on sports datasets. We hypothesize detection quality plays a major role: SoccerNet uses perfect detections, and SportsMOT detections come from a strong detector, and both show the largest improvements. To test this, we run an additional experiment using ground-truth boxes from MOT17 and SportsMOT as tracker detections.

Dataset (GT-as-det) IoU mean HOTA GIoU mean Δ DIoU mean Δ CIoU mean Δ BIoU mean Δ
MOT17 val 97.17 −0.05 −0.07 −0.05 +0.31
SportsMOT val 87.18 +0.47 +1.09 +1.06 +0.46

With ground-truth detections, mean ΔHOTA increases for three of four variants on SportsMOT compared to YOLOX detections. On MOT17, gaps narrow overall: GIoU moves closer to IoU, DIoU and CIoU remain slightly below IoU, and BIoU becomes positive on average. This is consistent with cleaner inputs: Kalman predictions align better with detections, so richer association signals can help more.


API Reference

BaseIoU

trackers.utils.iou.BaseIoU

Bases: ABC

Abstract base for IoU similarity metrics used in tracker association.

Subclasses implement a specific Intersection over Union variant (e.g. standard IoU, GIoU, DIoU, CIoU, BIoU) that computes a pairwise similarity matrix between two sets of bounding boxes.

The resulting matrix is used as a cost/similarity signal in the Hungarian algorithm during the data association step.

Examples:

Subclass BaseIoU to create a custom metric::

class MyIoU(BaseIoU):
    def _compute(
        self, boxes_1: np.ndarray, boxes_2: np.ndarray
    ) -> np.ndarray:
        return np.ones((len(boxes_1), len(boxes_2)))

metric = MyIoU()
matrix = metric.compute(np.array([[0, 0, 10, 10]]), np.array([[5, 5, 15, 15]]))
Note

Subclasses must override :meth:_compute, not :meth:compute. Overriding compute directly bypasses the empty-input guard which returns a zero matrix when either input has zero rows.

compute(boxes_1, boxes_2)

Compute pairwise similarity between two sets of bounding boxes.

Handles the empty-input edge case (returns a correctly-shaped zero matrix) and delegates to subclass _compute method for the actual math.

Parameters:

Name Type Description Default
boxes_1 ndarray

(N, 4) array of boxes in [x1, y1, x2, y2] format.

required
boxes_2 ndarray

(M, 4) array of boxes in [x1, y1, x2, y2] format.

required

Returns:

Type Description
ndarray

(N, M) similarity matrix where entry (i, j) is the

ndarray

similarity between boxes_1[i] and boxes_2[j].

Note

Input boxes are assumed well-formed (x1 <= x2 and y1 <= y2). No validation is performed; malformed boxes produce undefined output.

normalize_for_fusion(similarity_matrix)

Normalize similarity values for score fusion in BoT-SORT association.

By default returns the matrix unchanged. Signed variants (GIoU, DIoU, CIoU) override this to shift [-1, 1] scores into [0, 1] via (matrix + 1) / 2 so that score fusion preserves ranking for both overlapping and non-overlapping box pairs.

Parameters:

Name Type Description Default
similarity_matrix ndarray

(N, M) similarity matrix from :meth:compute.

required

Returns:

Type Description
ndarray

(N, M) matrix suitable for element-wise multiplication with

ndarray

detection confidence scores.

IoU

trackers.utils.iou.IoU

Bases: BaseIoU

Standard Intersection over Union.

Computes the ratio of the intersection area to the union area for every pair of boxes. Values range from 0 (no overlap) to 1 (perfect overlap). This is the classic metric used in SORT.

Examples:

>>> import numpy as np
>>> metric = IoU()
>>> boxes_a = np.array([[0.0, 0.0, 10.0, 10.0]])
>>> boxes_b = np.array([[5.0, 5.0, 15.0, 15.0]])
>>> metric.compute(boxes_a, boxes_b)
array([[0.14285715]], dtype=float32)

GIoU

trackers.utils.iou.GIoU

Bases: BaseIoU

Generalized Intersection over Union (Rezatofighi et al., 2019).

Extends standard IoU by penalizing the empty area within the smallest enclosing box that is not covered by either box. This provides a meaningful gradient even when the two boxes do not overlap.

GIoU = IoU - |C \ (A U B)| / |C|

Values are in [-1, 1]: near -1 for far-apart boxes, 1 for perfect overlap.

Reference: https://arxiv.org/abs/1902.09630

Examples:

GIoU is negative for non-overlapping boxes, providing gradient signal unavailable from standard IoU::

>>> import numpy as np
>>> metric = GIoU()
>>> boxes_a = np.array([[0.0, 0.0, 1.0, 1.0]])
>>> boxes_b = np.array([[5.0, 5.0, 6.0, 6.0]])
>>> float(metric.compute(boxes_a, boxes_b)[0, 0]) < 0
True

DIoU

trackers.utils.iou.DIoU

Bases: BaseIoU

Distance Intersection over Union (Zheng et al., 2019).

Extends IoU by penalizing the normalized Euclidean distance between bounding-box centers, using the diagonal length of the smallest enclosing rectangle as the scale. This yields a smooth signal when boxes overlap or are separated and aligns with how many detectors localize objects (center-based error).

DIoU = IoU - d^2 / (c^2 + epsilon)

where d is the center-to-center distance, c is the enclosing diagonal, and \epsilon avoids division by zero (same convention as :func:torchvision.ops.distance_box_iou).

Because the penalty is nonnegative, DIoU ≤ IoU for every pair. Values typically lie in [-1, 1] for well-formed boxes.

Reference: https://arxiv.org/abs/1911.08287

Examples:

DIoU penalizes center distance, so concentric boxes score 1.0::

>>> import numpy as np
>>> metric = DIoU()
>>> boxes_a = np.array([[0.0, 0.0, 4.0, 4.0]])
>>> boxes_b = np.array([[1.0, 1.0, 3.0, 3.0]])
>>> float(metric.compute(boxes_a, boxes_b)[0, 0]) <= 1.0
True

CIoU

trackers.utils.iou.CIoU

Bases: BaseIoU

Complete Intersection over Union (Zheng et al., 2019).

Builds on DIoU by adding a penalty for mismatched aspect ratio between boxes (via a term v on the difference of box arctan aspect ratios). The trade-off is weighted by \alpha that depends on IoU and v, matching :func:torchvision.ops.complete_box_iou.

CIoU = DIoU - alpha * v, with alpha = v / (1 - IoU + v + epsilon).

So CIoU ≤ DIoU ≤ IoU when widths and heights are positive. Scores are in [-1, 1], matching the range of :func:torchvision.ops.complete_box_iou.

Reference: https://arxiv.org/abs/1911.08287

Examples:

CIoU adds an aspect-ratio penalty to DIoU, so CIoU <= DIoU::

>>> import numpy as np
>>> metric_d = DIoU()
>>> metric_c = CIoU()
>>> boxes_a = np.array([[0.0, 0.0, 10.0, 5.0]])
>>> boxes_b = np.array([[8.0, 3.0, 15.0, 8.0]])
>>> lhs = float(metric_c.compute(boxes_a, boxes_b)[0, 0])
>>> rhs = float(metric_d.compute(boxes_a, boxes_b)[0, 0]) + 1e-6
>>> lhs <= rhs
True

BIoU

trackers.utils.iou.BIoU

Bases: BaseIoU

Buffered Intersection over Union.

Computes IoU after expanding each box by a configurable relative margin around its center:

  • x1' = x1 - r * w
  • y1' = y1 - r * h
  • x2' = x2 + r * w
  • y2' = y2 + r * h

where w = x2 - x1, h = y2 - y1, and r is buffer_ratio.

In practice, this makes association more tolerant to small localization gaps while preserving familiar IoU behavior. Setting buffer_ratio=0 recovers standard IoU exactly.

Reference: https://arxiv.org/pdf/2211.14317

Examples:

Buffer expands boxes before computing IoU — useful when detections are slightly outside the track's predicted region::

>>> import numpy as np
>>> metric = BIoU(buffer_ratio=0.1)
>>> boxes_a = np.array([[0.0, 0.0, 10.0, 10.0]])
>>> boxes_b = np.array([[11.0, 0.0, 21.0, 10.0]])
>>> float(metric.compute(boxes_a, boxes_b)[0, 0]) > 0
True

__init__(buffer_ratio=0.1)

Initialise BIoU with a configurable buffer ratio.

Parameters:

Name Type Description Default
buffer_ratio float

Non-negative relative margin to expand each box before computing IoU. 0 recovers standard IoU exactly; larger values tolerate wider localization gaps.

0.1

Raises:

Type Description
ValueError

If buffer_ratio is negative.