IoU API
IoU variants are pluggable similarity metrics used during detection-to-track
association. You pass one of these classes to a tracker via the iou argument.
What you'll learn:
- What each IoU variant measures and when to use it
- How score ranges affect
minimum_iou_thresholdtuning - How to configure IoU variants in any tracker
- How IoU variants perform across common tracking benchmarks
Install
Get started by installing trackers.
For more options, see the install guide.
Quickstart
from trackers import SORTTracker
from trackers.utils.iou import IoU
tracker = SORTTracker(
iou=IoU(),
minimum_iou_threshold=0.3,
)
Choosing a Metric
| Variant | Score range | When to use |
|---|---|---|
IoU |
[0, 1] |
Default — strong baseline for most scenes |
GIoU |
[-1, 1] |
Scenes where boxes frequently lose overlap (occlusion, re-entry) |
DIoU |
[-1, 1] |
Fast-moving objects; centre-distance signal without aspect sensitivity |
CIoU |
[-1, 1] |
Same as DIoU plus aspect-ratio consistency |
BIoU |
[0, 1] |
Very small or very fast objects where raw boxes rarely overlap |
Formula Summary (A, B boxes, C enclosing box, d center distance, c enclosing diagonal):
- \( \mathrm{GIoU} = \mathrm{IoU} - \frac{|C \setminus (A \cup B)|}{|C|} \)
- \( \mathrm{DIoU} = \mathrm{IoU} - \frac{d^2}{c^2 + \epsilon} \)
- \( \mathrm{CIoU} = \mathrm{DIoU} - \alpha v \), where \( v = \frac{4}{\pi^2}\left(\arctan\frac{w_A}{h_A} - \arctan\frac{w_B}{h_B}\right)^2 \) and \( \alpha = \frac{v}{1 - \mathrm{IoU} + v + \epsilon} \)
IoU
Standard Intersection over Union — the classic baseline.
Scores are 0 (no overlap) to 1 (perfect overlap). Because it returns 0 whenever
boxes do not intersect, the tracker gets no gradient to recover a lost track; a
variant from the list below can help in those cases.
from trackers import OCSORTTracker, SORTTracker
from trackers.utils.iou import BIoU, CIoU, GIoU, IoU
# Standard IoU in SORT
sort_iou = SORTTracker(iou=IoU(), minimum_iou_threshold=0.3)
# GIoU in OC-SORT (negative thresholds are valid)
ocsort_giou = OCSORTTracker(iou=GIoU(), minimum_iou_threshold=-0.3)
# CIoU in OC-SORT
ocsort_ciou = OCSORTTracker(iou=CIoU(), minimum_iou_threshold=-0.3)
# Buffered IoU in SORT
sort_biou = SORTTracker(
iou=BIoU(buffer_ratio=0.1),
minimum_iou_threshold=0.3,
)
Threshold Notes
Set minimum_iou_threshold based on the score range of your chosen metric.
IoUandBIoUusually work with non-negative thresholds (for example,0.2to0.5).GIoU,DIoU, andCIoUcan produce negative scores, so negative thresholds are valid.- Tune thresholds per dataset and tracker; there is no universal best value.
GIoU
Generalised IoU (Rezatofighi et al., 2019) — penalises the gap inside the
smallest enclosing box C that neither A nor B fills.
When boxes do not overlap at all, IoU is flat at 0, but the penalty term still
changes as boxes move closer or farther apart, giving the tracker a meaningful
signal based on distances, sizes and shapes.
from trackers import OCSORTTracker
from trackers.utils.iou import GIoU
# Negative thresholds are valid and often optimal for GIoU
tracker = OCSORTTracker(iou=GIoU(), minimum_iou_threshold=-0.3)
Example — SportsMOT v_0kUtTtmLaJA_c006
| HOTA (%) | Δ (pts) | |
|---|---|---|
| Best IoU | 73.07 | — |
| Best GIoU | 89.31 | +16.24 |
Left: IoU. Right: GIoU. Camera movements can introduce unexpected displacement,
producing ID switches with IoU-based association. GIoU still provides a signal when
there is no overlap by considering enclosing-box geometry, which helps preserve
tracks that IoU would otherwise confuse or lose due to direction changes and
non-linear motion (for example, tracks 5, 12 on the left vs 13 on the right).
DIoU
Distance IoU (Zheng et al., 2019) — adds a centre-distance penalty to IoU, normalised by the enclosing box diagonal.
where d is the Euclidean distance between box centres and c is the diagonal of
the smallest enclosing rectangle. This encourages centre alignment independently of
aspect ratio and tends to produce smoother associations in fast-motion sequences.
from trackers import OCSORTTracker
from trackers.utils.iou import DIoU
tracker = OCSORTTracker(iou=DIoU(), minimum_iou_threshold=-0.3)
Example — SportsMOT v_0kUtTtmLaJA_c006
| HOTA (%) | Δ (pts) | |
|---|---|---|
| Best IoU | 73.07 | — |
| Best DIoU | 86.53 | +13.46 |
Left: IoU. Right: DIoU. Highly non-linear motion can make IoU drop to zero,
causing the Kalman prediction to attach to another object and produce an ID switch.
The centre-distance term keeps the score smoother and preserves IDs more often
(for example, tracks 3–5).
CIoU
Complete IoU (Zheng et al., 2019) — extends DIoU with a penalty for aspect-ratio mismatch between the two boxes.
v measures aspect-ratio divergence; α scales it so the penalty is low when IoU
is already high. On tracking benchmarks CIoU and DIoU behave similarly.
from trackers import OCSORTTracker
from trackers.utils.iou import CIoU
tracker = OCSORTTracker(iou=CIoU(), minimum_iou_threshold=-0.3)
Example — SoccerNet SNMOT-122
| HOTA (%) | Δ (pts) | |
|---|---|---|
| Best IoU | 77.36 | — |
| Best CIoU | 85.58 | +8.22 |
Left: IoU. Right: CIoU. In this example, CIoU is capable of perfectly keeping the track of the ball, which is explained by the fact that the ball is a small and fast moving object, with roughly constant aspect ratio, where CIoU’s distance + aspect terms help more than overlap alone.
BIoU
Buffered IoU (Yang et al., 2022) — expands each box by a relative margin r
before computing standard IoU. Let w = x2 − x1, h = y2 − y1:
r = 0 recovers plain IoU exactly. Enlarging boxes creates artificial overlap for
objects that are geometrically close, which is useful when detections are very small
or objects move fast enough so that consecutive boxes miss each other entirely.
from trackers import SORTTracker
from trackers.utils.iou import BIoU
tracker = SORTTracker(iou=BIoU(buffer_ratio=0.15), minimum_iou_threshold=0.3)
Example — SportsMOT v_9MHDmAMxO5I_c004
| HOTA (%) | Δ (pts) | |
|---|---|---|
| Best IoU | 80.54 | — |
| Best BIoU | 88.00 | +7.46 |
Left: IoU. Right: BIoU. Notice how ID switches happen when fast players temporarily produce non-overlapping boxes between frames. The buffer closes that gap and keeps the same ID. (e.g. tracks 7 and 8).
IoU Variant Performance Across Benchmarks
We evaluate how much each variant changes performance across datasets.
For each (dataset, tracker) pair, we keep the state_estimator with the highest IoU HOTA on the evaluation split, then report mean
ΔHOTA = HOTA(variant) − HOTA(IoU) over trackers (same split; thresholds tuned per experiment).
For more information on the datasets, see: dataset comparison.
| Dataset | IoU mean HOTA | GIoU mean Δ | DIoU mean Δ | CIoU mean Δ | BIoU mean Δ |
|---|---|---|---|---|---|
| MOT17 val | 38.09 | −0.09 | −0.04 | −0.04 | −0.28 |
| SportsMOT val | 80.21 | +0.65 | +0.95 | +0.88 | +0.36 |
| DanceTrack val | 50.27 | −0.80 | −0.34 | +0.05 | +0.15 |
| SoccerNet test | 83.21 | +1.57 | +2.82 | +2.76 | +1.41 |
Over SportsMOT and SoccerNet, all IoU variants outperform standard IoU, with DIoU and CIoU strongest on SoccerNet and DIoU slightly ahead of CIoU on SportsMOT. In MOT17, standard IoU is best by a small margin (DIoU and CIoU are similar). On DanceTrack, GIoU and DIoU underperform IoU, while CIoU and BIoU perform slightly better.
These experiments suggest IoU variants provide task-dependent gains, with larger improvements on sports datasets. We hypothesize detection quality plays a major role: SoccerNet uses perfect detections, and SportsMOT detections come from a strong detector, and both show the largest improvements. To test this, we run an additional experiment using ground-truth boxes from MOT17 and SportsMOT as tracker detections.
| Dataset (GT-as-det) | IoU mean HOTA | GIoU mean Δ | DIoU mean Δ | CIoU mean Δ | BIoU mean Δ |
|---|---|---|---|---|---|
| MOT17 val | 97.17 | −0.05 | −0.07 | −0.05 | +0.31 |
| SportsMOT val | 87.18 | +0.47 | +1.09 | +1.06 | +0.46 |
With ground-truth detections, mean ΔHOTA increases for three of four variants on SportsMOT compared to YOLOX detections. On MOT17, gaps narrow overall: GIoU moves closer to IoU, DIoU and CIoU remain slightly below IoU, and BIoU becomes positive on average. This is consistent with cleaner inputs: Kalman predictions align better with detections, so richer association signals can help more.
API Reference
BaseIoU
trackers.utils.iou.BaseIoU
Bases: ABC
Abstract base for IoU similarity metrics used in tracker association.
Subclasses implement a specific Intersection over Union variant (e.g. standard IoU, GIoU, DIoU, CIoU, BIoU) that computes a pairwise similarity matrix between two sets of bounding boxes.
The resulting matrix is used as a cost/similarity signal in the Hungarian algorithm during the data association step.
Examples:
Subclass BaseIoU to create a custom metric::
class MyIoU(BaseIoU):
def _compute(
self, boxes_1: np.ndarray, boxes_2: np.ndarray
) -> np.ndarray:
return np.ones((len(boxes_1), len(boxes_2)))
metric = MyIoU()
matrix = metric.compute(np.array([[0, 0, 10, 10]]), np.array([[5, 5, 15, 15]]))
Note
Subclasses must override :meth:_compute, not :meth:compute.
Overriding compute directly bypasses the empty-input guard which
returns a zero matrix when either input has zero rows.
compute(boxes_1, boxes_2)
Compute pairwise similarity between two sets of bounding boxes.
Handles the empty-input edge case (returns a correctly-shaped zero
matrix) and delegates to subclass _compute method for the actual math.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
boxes_1
|
ndarray
|
|
required |
boxes_2
|
ndarray
|
|
required |
Returns:
| Type | Description |
|---|---|
ndarray
|
|
ndarray
|
similarity between |
Note
Input boxes are assumed well-formed (x1 <= x2 and
y1 <= y2). No validation is performed; malformed boxes
produce undefined output.
normalize_for_fusion(similarity_matrix)
Normalize similarity values for score fusion in BoT-SORT association.
By default returns the matrix unchanged. Signed variants (GIoU, DIoU,
CIoU) override this to shift [-1, 1] scores into [0, 1] via
(matrix + 1) / 2 so that score fusion preserves ranking for both
overlapping and non-overlapping box pairs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
similarity_matrix
|
ndarray
|
|
required |
Returns:
| Type | Description |
|---|---|
ndarray
|
|
ndarray
|
detection confidence scores. |
IoU
trackers.utils.iou.IoU
Bases: BaseIoU
Standard Intersection over Union.
Computes the ratio of the intersection area to the union area for every pair of boxes. Values range from 0 (no overlap) to 1 (perfect overlap). This is the classic metric used in SORT.
Examples:
>>> import numpy as np
>>> metric = IoU()
>>> boxes_a = np.array([[0.0, 0.0, 10.0, 10.0]])
>>> boxes_b = np.array([[5.0, 5.0, 15.0, 15.0]])
>>> metric.compute(boxes_a, boxes_b)
array([[0.14285715]], dtype=float32)
GIoU
trackers.utils.iou.GIoU
Bases: BaseIoU
Generalized Intersection over Union (Rezatofighi et al., 2019).
Extends standard IoU by penalizing the empty area within the smallest enclosing box that is not covered by either box. This provides a meaningful gradient even when the two boxes do not overlap.
GIoU = IoU - |C \ (A U B)| / |C|
Values are in [-1, 1]: near -1 for far-apart boxes, 1 for perfect overlap.
Reference: https://arxiv.org/abs/1902.09630
Examples:
GIoU is negative for non-overlapping boxes, providing gradient signal unavailable from standard IoU::
>>> import numpy as np
>>> metric = GIoU()
>>> boxes_a = np.array([[0.0, 0.0, 1.0, 1.0]])
>>> boxes_b = np.array([[5.0, 5.0, 6.0, 6.0]])
>>> float(metric.compute(boxes_a, boxes_b)[0, 0]) < 0
True
DIoU
trackers.utils.iou.DIoU
Bases: BaseIoU
Distance Intersection over Union (Zheng et al., 2019).
Extends IoU by penalizing the normalized Euclidean distance between bounding-box centers, using the diagonal length of the smallest enclosing rectangle as the scale. This yields a smooth signal when boxes overlap or are separated and aligns with how many detectors localize objects (center-based error).
DIoU = IoU - d^2 / (c^2 + epsilon)
where d is the center-to-center distance, c is the enclosing
diagonal, and \epsilon avoids division by zero (same convention as
:func:torchvision.ops.distance_box_iou).
Because the penalty is nonnegative, DIoU ≤ IoU for every pair.
Values typically lie in [-1, 1] for well-formed boxes.
Reference: https://arxiv.org/abs/1911.08287
Examples:
DIoU penalizes center distance, so concentric boxes score 1.0::
>>> import numpy as np
>>> metric = DIoU()
>>> boxes_a = np.array([[0.0, 0.0, 4.0, 4.0]])
>>> boxes_b = np.array([[1.0, 1.0, 3.0, 3.0]])
>>> float(metric.compute(boxes_a, boxes_b)[0, 0]) <= 1.0
True
CIoU
trackers.utils.iou.CIoU
Bases: BaseIoU
Complete Intersection over Union (Zheng et al., 2019).
Builds on DIoU by adding a penalty for mismatched aspect ratio between
boxes (via a term v on the difference of box arctan aspect ratios).
The trade-off is weighted by \alpha that depends on IoU and v,
matching :func:torchvision.ops.complete_box_iou.
CIoU = DIoU - alpha * v, with
alpha = v / (1 - IoU + v + epsilon).
So CIoU ≤ DIoU ≤ IoU when widths and heights are positive.
Scores are in [-1, 1], matching the range of
:func:torchvision.ops.complete_box_iou.
Reference: https://arxiv.org/abs/1911.08287
Examples:
CIoU adds an aspect-ratio penalty to DIoU, so CIoU <= DIoU::
>>> import numpy as np
>>> metric_d = DIoU()
>>> metric_c = CIoU()
>>> boxes_a = np.array([[0.0, 0.0, 10.0, 5.0]])
>>> boxes_b = np.array([[8.0, 3.0, 15.0, 8.0]])
>>> lhs = float(metric_c.compute(boxes_a, boxes_b)[0, 0])
>>> rhs = float(metric_d.compute(boxes_a, boxes_b)[0, 0]) + 1e-6
>>> lhs <= rhs
True
BIoU
trackers.utils.iou.BIoU
Bases: BaseIoU
Buffered Intersection over Union.
Computes IoU after expanding each box by a configurable relative margin around its center:
x1' = x1 - r * wy1' = y1 - r * hx2' = x2 + r * wy2' = y2 + r * h
where w = x2 - x1, h = y2 - y1, and r is buffer_ratio.
In practice, this makes association more tolerant to small localization
gaps while preserving familiar IoU behavior. Setting
buffer_ratio=0 recovers standard IoU exactly.
Reference: https://arxiv.org/pdf/2211.14317
Examples:
Buffer expands boxes before computing IoU — useful when detections are slightly outside the track's predicted region::
>>> import numpy as np
>>> metric = BIoU(buffer_ratio=0.1)
>>> boxes_a = np.array([[0.0, 0.0, 10.0, 10.0]])
>>> boxes_b = np.array([[11.0, 0.0, 21.0, 10.0]])
>>> float(metric.compute(boxes_a, boxes_b)[0, 0]) > 0
True
__init__(buffer_ratio=0.1)
Initialise BIoU with a configurable buffer ratio.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
buffer_ratio
|
float
|
Non-negative relative margin to expand each box
before computing IoU. |
0.1
|
Raises:
| Type | Description |
|---|---|
ValueError
|
If |