Tracker Comparison
This page shows head-to-head performance of SORT, ByteTrack, OC-SORT, BoT-SORT, and C-BIoU on standard MOT benchmarks. Results are shown with default parameters and with tuned configurations found via hyperparameter search (grid search or Optuna; see Methodology).
Benchmark version
Results use trackers v2.3.0 (released 2026-03-16). Detections are from YOLOX (MOT17, SportsMOT, DanceTrack) or ground-truth oracle boxes (SoccerNet). Parameters were tuned via grid search or Optuna on a split separate from the one used for reporting. See Methodology for details.
Benchmark methodology
Results were measured using YOLOX detections (MOT17, SportsMOT, DanceTrack) or oracle ground-truth boxes (SoccerNet), with both default and tuned parameters. Performance varies across detectors; see Detection Quality Matters for the impact of detector quality on tracking metrics.
MOT17
Pedestrian tracking with crowded scenes and frequent occlusions. Strongly tests re-identification and identity stability.
Visualization of ground-truth annotations for MOT17.
Info
Parameters were tuned on the validation set. Results are reported on the test set via Codabench submission. Detections come from a YOLOX model.
Results using default tracker parameters.
| Tracker | HOTA | IDF1 | MOTA |
|---|---|---|---|
| SORT | 58.4 | 69.9 | 67.2 |
| ByteTrack | 60.1 | 73.2 | 74.1 |
| OC-SORT | 61.9 | 76.4 | 76.0 |
| BoT-SORT | 63.7 | 78.7 | 79.2 |
| C-BIoU | 63.0 | 79.1 | 77.4 |
Results after tuning tracker parameters (see Methodology for the search procedure).
| Tracker | HOTA | IDF1 | MOTA |
|---|---|---|---|
| SORT | 60.4 | 72.5 | 75.8 |
| ByteTrack | 60.5 | 72.7 | 76.1 |
| OC-SORT | 62.0 | 76.5 | 77.3 |
| BoT-SORT | 63.8 | 78.7 | 79.4 |
| C-BIoU | 63.0 | 79.1 | 77.4 |
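To read the two MOT17 tables side by side, the per-tracker tuning gain can be computed directly. The snippet below is only an illustrative sketch: the dictionary names and the `hota_gain` helper are hypothetical, and the only real data are the HOTA values copied from the tables above.

```python
# HOTA values copied from the MOT17 tables above (default vs. tuned parameters).
default_hota = {"SORT": 58.4, "ByteTrack": 60.1, "OC-SORT": 61.9, "BoT-SORT": 63.7, "C-BIoU": 63.0}
tuned_hota = {"SORT": 60.4, "ByteTrack": 60.5, "OC-SORT": 62.0, "BoT-SORT": 63.8, "C-BIoU": 63.0}


def hota_gain(default, tuned):
    """Return tuned-minus-default HOTA per tracker, rounded to one decimal."""
    return {name: round(tuned[name] - default[name], 1) for name in default}


gains = hota_gain(default_hota, tuned_hota)
for name, gain in gains.items():
    print(f"{name:10s} {gain:+.1f}")
```

On MOT17, SORT gains the most from tuning (+2.0 HOTA), while the other trackers move by at most 0.4.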
Tuned configuration for each tracker.
SORT:
  lost_track_buffer: 10
  track_activation_threshold: 0.75
  minimum_consecutive_frames: 2
  minimum_iou_threshold: 0.3
ByteTrack:
  lost_track_buffer: 10
  track_activation_threshold: 0.7
  minimum_consecutive_frames: 1
  minimum_iou_threshold: 0.3
  high_conf_det_threshold: 0.5
OC-SORT:
  lost_track_buffer: 30
  minimum_iou_threshold: 0.3
  minimum_consecutive_frames: 3
  direction_consistency_weight: 0.2
  high_conf_det_threshold: 0.4
  delta_t: 1
BoT-SORT:
  lost_track_buffer: 30
  minimum_consecutive_frames: 2
  minimum_iou_threshold_first_assoc: 0.2
  minimum_iou_threshold_second_assoc: 0.5
  minimum_iou_threshold_unconfirmed_assoc: 0.2
  high_conf_det_threshold: 0.5
  track_activation_threshold: 0.6
  enable_cmc: true
  cmc_method: sparseOptFlow
C-BIoU:
  lost_track_buffer: 30
  minimum_consecutive_frames: 2
  minimum_iou_threshold_first_assoc: 0.2
  minimum_iou_threshold_second_assoc: 0.5
  minimum_iou_threshold_unconfirmed_assoc: 0.3
  high_conf_det_threshold: 0.6
  track_activation_threshold: 0.7
  buffer_ratio_first: 0.3
  buffer_ratio_second: 0.5
SportsMOT
Sports broadcast tracking with fast motion, camera pans, and similar-looking targets. Tests association under speed and appearance ambiguity.
Visualization of ground-truth annotations for SportsMOT.
Info
Parameters were tuned on the validation set. Results are reported on the test set via Codabench submission. Detections come from a YOLOX model.
Results using default tracker parameters.
| Tracker | HOTA | IDF1 | MOTA |
|---|---|---|---|
| SORT | 70.8 | 68.9 | 95.5 |
| ByteTrack | 73.0 | 72.5 | 96.4 |
| OC-SORT | 71.7 | 71.4 | 95.0 |
| BoT-SORT | 73.8 | 73.4 | 96.9 |
| C-BIoU | 73.1 | 72.6 | 96.7 |
Results after tuning tracker parameters (see Methodology for the search procedure).
| Tracker | HOTA | IDF1 | MOTA |
|---|---|---|---|
| SORT | 72.9 | 73.0 | 95.8 |
| ByteTrack | 73.3 | 73.5 | 95.9 |
| OC-SORT | 74.0 | 75.4 | 95.6 |
| BoT-SORT | 74.1 | 74.0 | 96.9 |
| C-BIoU | 73.1 | 72.6 | 96.7 |
Tuned configuration for each tracker.
SORT:
  lost_track_buffer: 60
  track_activation_threshold: 0.9
  minimum_consecutive_frames: 2
  minimum_iou_threshold: 0.05
ByteTrack:
  lost_track_buffer: 10
  track_activation_threshold: 0.9
  minimum_consecutive_frames: 1
  minimum_iou_threshold: 0.05
  high_conf_det_threshold: 0.7
OC-SORT:
  lost_track_buffer: 60
  minimum_iou_threshold: 0.1
  minimum_consecutive_frames: 3
  direction_consistency_weight: 0.2
  high_conf_det_threshold: 0.6
  delta_t: 3
BoT-SORT:
  lost_track_buffer: 30
  minimum_consecutive_frames: 2
  minimum_iou_threshold_first_assoc: 0.1
  minimum_iou_threshold_second_assoc: 0.5
  minimum_iou_threshold_unconfirmed_assoc: 0.3
  high_conf_det_threshold: 0.7
  track_activation_threshold: 0.8
  enable_cmc: true
  cmc_method: sparseOptFlow
C-BIoU:
  lost_track_buffer: 30
  minimum_consecutive_frames: 2
  minimum_iou_threshold_first_assoc: 0.2
  minimum_iou_threshold_second_assoc: 0.5
  minimum_iou_threshold_unconfirmed_assoc: 0.3
  high_conf_det_threshold: 0.6
  track_activation_threshold: 0.7
  buffer_ratio_first: 0.3
  buffer_ratio_second: 0.5
SoccerNet-tracking
Long sequences with dense interactions and partial occlusions. Tests long-term ID consistency.
Visualization of ground-truth annotations for SoccerNet.
Info
Parameters were tuned on the train set. Results are reported on the test set. SoccerNet-tracking has no validation split. This dataset provides oracle (ground-truth) detections.
Results using default tracker parameters.
| Tracker | HOTA | IDF1 | MOTA |
|---|---|---|---|
| SORT | 81.6 | 76.2 | 95.1 |
| ByteTrack | 84.0 | 78.1 | 97.8 |
| OC-SORT | 78.4 | 72.6 | 94.1 |
| BoT-SORT | 84.5 | 79.3 | 96.6 |
| C-BIoU | 82.6 | 76.6 | 97.0 |
Results after tuning tracker parameters (grid search, or Optuna for C-BIoU; see Methodology).
| Tracker | HOTA | IDF1 | MOTA |
|---|---|---|---|
| SORT | 84.2 | 78.2 | 98.2 |
| ByteTrack | 84.0 | 78.1 | 98.2 |
| OC-SORT | 82.9 | 77.9 | 96.8 |
| BoT-SORT | 85.0 | 79.7 | 97.2 |
| C-BIoU | 85.7 | 80.0 | 99.3 |
Tuned configuration for each tracker.
SORT:
  lost_track_buffer: 30
  track_activation_threshold: 0.25
  minimum_consecutive_frames: 2
  minimum_iou_threshold: 0.05
ByteTrack:
  lost_track_buffer: 30
  track_activation_threshold: 0.2
  minimum_consecutive_frames: 1
  minimum_iou_threshold: 0.05
  high_conf_det_threshold: 0.5
OC-SORT:
  lost_track_buffer: 60
  minimum_iou_threshold: 0.1
  minimum_consecutive_frames: 3
  direction_consistency_weight: 0.2
  high_conf_det_threshold: 0.4
  delta_t: 1
BoT-SORT:
  lost_track_buffer: 60
  minimum_consecutive_frames: 2
  minimum_iou_threshold_first_assoc: 0.1
  minimum_iou_threshold_second_assoc: 0.6
  minimum_iou_threshold_unconfirmed_assoc: 0.2
  high_conf_det_threshold: 0.6
  track_activation_threshold: 0.7
  enable_cmc: true
  cmc_method: sparseOptFlow
C-BIoU:
  lost_track_buffer: 43
  minimum_consecutive_frames: 2
  minimum_iou_threshold_first_assoc: 0.05
  minimum_iou_threshold_second_assoc: 0.46
  minimum_iou_threshold_unconfirmed_assoc: 0.27
  high_conf_det_threshold: 0.40
  track_activation_threshold: 0.48
  buffer_ratio_first: 0.68
  buffer_ratio_second: 0.50
SoccerNet buffer ordering exception
This config uses buffer_ratio_first: 0.68 > buffer_ratio_second: 0.50, which reverses
the general b1 < b2 recommendation in the C-BIoU docs.
Optuna found this ordering yields higher HOTA on SoccerNet's dense, long-sequence scenarios.
On most other datasets the b1 < b2 default applies.
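To make the effect of the two buffer ratios concrete, here is a minimal sketch of buffered IoU. It assumes the common formulation in which each box is expanded by `ratio * width` and `ratio * height` on every side before IoU is computed; the function names are hypothetical and this is not taken from the library's implementation.

```python
def buffered_box(box, ratio):
    """Expand an (x1, y1, x2, y2) box by ratio * width / height on every side."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return (x1 - ratio * w, y1 - ratio * h, x2 + ratio * w, y2 + ratio * h)


def iou(a, b):
    """Plain intersection-over-union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def buffered_iou(a, b, ratio):
    """IoU after both boxes are expanded by the same buffer ratio."""
    return iou(buffered_box(a, ratio), buffered_box(b, ratio))


# A track box and a detection that moved far enough to leave a 2-pixel gap.
track_box = (0.0, 0.0, 10.0, 20.0)
det_box = (12.0, 0.0, 22.0, 20.0)

plain = buffered_iou(track_box, det_box, 0.0)    # no overlap without a buffer
first = buffered_iou(track_box, det_box, 0.68)   # buffer_ratio_first from this config
second = buffered_iou(track_box, det_box, 0.50)  # buffer_ratio_second from this config
print(plain, round(first, 3), round(second, 3))
```

In this toy case the larger ratio produces the larger overlap, so with `buffer_ratio_first: 0.68` the first association stage is the more permissive one.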
DanceTrack
Group dancing tracking with uniform appearance, diverse motions, and extreme articulation. Tests motion-based association without relying on visual discrimination.
Visualization of ground-truth annotations for DanceTrack.
Info
Parameters were tuned on the validation set. Results are reported on the test set via Codabench submission. Detections come from a YOLOX model.
Results using default tracker parameters.
| Tracker | HOTA | IDF1 | MOTA |
|---|---|---|---|
| SORT | 47.2 | 41.0 | 86.5 |
| ByteTrack | 53.3 | 53.6 | 90.3 |
| OC-SORT | 54.1 | 53.3 | 89.3 |
| BoT-SORT | 57.8 | 57.9 | 92.2 |
| C-BIoU | 56.7 | 56.7 | 92.2 |
Results after hyperparameter tuning. Each row reports the best tuned configuration for that tracker, tuned on the validation split and evaluated on the test set. Where tuning did not outperform the registry defaults, the default results are shown.
| Tracker | HOTA | IDF1 | MOTA |
|---|---|---|---|
| SORT | 54.3 | 53.4 | 89.5 |
| ByteTrack | 55.3 | 55.2 | 89.9 |
| OC-SORT | 54.1 | 53.3 | 89.3 |
| BoT-SORT | 57.8 | 57.9 | 92.2 |
| C-BIoU | 57.7 | 58.7 | 92.4 |
Best configuration for each tracker.
SORT:
  lost_track_buffer: 91
  track_activation_threshold: 0.89
  minimum_consecutive_frames: 3
  minimum_iou_threshold: 0.21
ByteTrack:
  lost_track_buffer: 76
  track_activation_threshold: 0.9
  minimum_consecutive_frames: 4
  minimum_iou_threshold: 0.33
  high_conf_det_threshold: 0.52
OC-SORT:
  lost_track_buffer: 30
  minimum_iou_threshold: 0.3
  minimum_consecutive_frames: 3
  direction_consistency_weight: 0.2
  high_conf_det_threshold: 0.6
  delta_t: 3
BoT-SORT:
  lost_track_buffer: 30
  minimum_consecutive_frames: 2
  minimum_iou_threshold_first_assoc: 0.2
  minimum_iou_threshold_second_assoc: 0.5
  minimum_iou_threshold_unconfirmed_assoc: 0.3
  high_conf_det_threshold: 0.6
  track_activation_threshold: 0.7
  enable_cmc: true
  cmc_method: sparseOptFlow
C-BIoU:
  lost_track_buffer: 37
  track_activation_threshold: 0.71
  minimum_consecutive_frames: 3
  minimum_iou_threshold_first_assoc: 0.22
  minimum_iou_threshold_second_assoc: 0.70
  minimum_iou_threshold_unconfirmed_assoc: 0.26
  high_conf_det_threshold: 0.34
  buffer_ratio_first: 0.12
  buffer_ratio_second: 0.10
DanceTrack buffer ordering exception
This config uses buffer_ratio_first: 0.12 > buffer_ratio_second: 0.10, which reverses
the general b1 < b2 recommendation in the C-BIoU docs.
Optuna found this ordering on DanceTrack's validation split; the margin (0.02) is small
and the b1 < b2 default applies on most other datasets.
Methodology
Detections
Each dataset uses one of two detection sources: oracle detections (ground-truth bounding boxes provided by the dataset) or model detections (produced by a YOLOX detector following the ByteTrack procedure). The source is noted per dataset above.
Tuning
Best parameters per tracker and dataset were found via grid search (SORT, ByteTrack,
OC-SORT, BoT-SORT) or Optuna (C-BIoU; n_trials=100, objective HOTA, trial 0 = defaults),
selecting the configuration with the highest HOTA on the tune split. Tuning and
evaluation always use separate data splits to reflect real-world usage:
- Train + validation + test: tune on validation, report on test.
- Train + validation: tune on train, report on validation.
- Train + test: tune on train, report on test.
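The split rules and the selection criterion above can be sketched as follows. This is a simplified illustration, not the benchmark's actual tuning code: `choose_splits`, `grid_search`, and `toy_hota` are hypothetical names, and `toy_hota` is a made-up stand-in for "run the tracker on the tune split and compute HOTA". The search starts from the defaults so that they are kept unless a candidate scores strictly higher.

```python
from itertools import product


def choose_splits(available):
    """Map the splits a dataset provides to (tune_split, report_split)."""
    available = set(available)
    if {"train", "validation", "test"} <= available:
        return "validation", "test"
    if {"train", "validation"} <= available:
        return "train", "validation"
    if {"train", "test"} <= available:
        return "train", "test"
    raise ValueError("need at least two distinct splits to tune and report")


def grid_search(evaluate, grid, defaults):
    """Return (best_params, best_score), keeping the defaults unless beaten."""
    best_params, best_score = dict(defaults), evaluate(defaults)
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        params = {**defaults, **dict(zip(names, values))}
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_hota(params):
    """Made-up objective that peaks at buffer=10 and IoU threshold=0.3."""
    return (60.0
            - abs(params["minimum_iou_threshold"] - 0.3) * 10
            - abs(params["lost_track_buffer"] - 10) * 0.1)


grid = {"lost_track_buffer": [10, 30, 60], "minimum_iou_threshold": [0.05, 0.1, 0.3]}
defaults = {"lost_track_buffer": 30, "minimum_iou_threshold": 0.1}

tune_split, report_split = choose_splits(["train", "test"])
best_params, best_score = grid_search(toy_hota, grid, defaults)
print(tune_split, report_split, best_params, best_score)
```

Only the tune split is ever passed to the objective; the report split is touched once, with the selected configuration.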
When to Use Each Tracker
SORT is the right choice when speed is the primary constraint and scenes are not heavily occluded. Its Kalman filter plus Hungarian matching runs at hundreds of frames per second and produces clean, easy-to-debug results. Use SORT as a baseline before adding more complex trackers, or when deploying on edge devices with tight compute budgets.
ByteTrack is the default recommendation for most applications. With default parameters it outperforms SORT on all four benchmarks by recovering low-confidence detections that SORT discards; after tuning the gap narrows, and SORT edges ahead on SoccerNet HOTA (84.2 vs. 84.0). The two-stage association adds almost no extra compute and consistently reduces missed tracks and identity switches. Use ByteTrack when your detector produces noisy or variable-confidence outputs: sports video, aerial footage, and crowded retail scenes all benefit.
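The two-stage idea can be sketched in a few lines. This is a simplified illustration under stated assumptions: it uses greedy IoU matching instead of the Hungarian assignment a real tracker would use, it omits motion prediction, and the `low_conf_floor` parameter and all function names are hypothetical rather than part of the library's API.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks, detections, min_iou):
    """Pair tracks and detections by descending IoU (stand-in for Hungarian)."""
    candidates = sorted(
        ((iou(tbox, dbox), tid, did)
         for tid, tbox in tracks.items()
         for did, dbox in detections.items()),
        reverse=True,
    )
    matches, used_tracks, used_dets = [], set(), set()
    for score, tid, did in candidates:
        if score < min_iou:
            break
        if tid in used_tracks or did in used_dets:
            continue
        matches.append((tid, did))
        used_tracks.add(tid)
        used_dets.add(did)
    unmatched = [tid for tid in tracks if tid not in used_tracks]
    return matches, unmatched


def two_stage_associate(tracks, detections, high_conf_det_threshold=0.5,
                        low_conf_floor=0.1, minimum_iou_threshold=0.3):
    """Match high-confidence detections first, then retry leftovers on low-confidence ones."""
    high = {i: box for i, (box, score) in enumerate(detections)
            if score >= high_conf_det_threshold}
    low = {i: box for i, (box, score) in enumerate(detections)
           if low_conf_floor <= score < high_conf_det_threshold}
    first, leftover = greedy_match(tracks, high, minimum_iou_threshold)
    remaining = {tid: tracks[tid] for tid in leftover}
    second, lost = greedy_match(remaining, low, minimum_iou_threshold)
    return first + second, lost


tracks = {1: (0, 0, 10, 10), 2: (50, 50, 60, 60)}
detections = [((1, 0, 11, 10), 0.9), ((51, 50, 61, 60), 0.3)]  # (box, confidence)
matches, lost = two_stage_associate(tracks, detections)
print(matches, lost)
```

Track 2 is kept alive only because the second stage considers the 0.3-confidence detection; a single stage restricted to high-confidence detections would have marked it lost.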
OC-SORT is best when camera motion is significant or objects follow non-linear paths. Its observation-centric re-update mechanism and direction consistency cost reduce drift from the linear motion assumption. Use OC-SORT when SORT or ByteTrack loses tracks on fast turns, camera pans, or erratic motion. Its HOTA margin over both on MOT17 (61.9 vs. 58.4 and 60.1 with default parameters) is consistent with these conditions.
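As a rough illustration of a direction consistency term, the sketch below compares the direction a track has been moving (between two past observations, `delta_t` frames apart) with the direction from its last observation to a candidate detection. The normalization to a 0 to 1 agreement score and the additive combination with IoU are assumptions chosen for readability; they are not the exact formula used by OC-SORT or by this library, and all function names are hypothetical.

```python
import math


def center(box):
    """Center point of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def direction(p, q):
    """Angle in radians of the vector from point p to point q."""
    return math.atan2(q[1] - p[1], q[0] - p[0])


def direction_agreement(prev_obs, last_obs, detection):
    """Return 1.0 when the candidate continues the track's heading, 0.0 when it reverses it."""
    track_angle = direction(center(prev_obs), center(last_obs))
    cand_angle = direction(center(last_obs), center(detection))
    diff = abs(track_angle - cand_angle)
    diff = min(diff, 2 * math.pi - diff)  # wrap the difference into [0, pi]
    return 1.0 - diff / math.pi


def association_score(iou_value, agreement, direction_consistency_weight=0.2):
    """Combine overlap and heading agreement (additive weighting is an assumption)."""
    return iou_value + direction_consistency_weight * agreement


prev_obs = (0, 0, 10, 10)    # observation delta_t frames ago
last_obs = (10, 0, 20, 10)   # most recent observation: the track moves right
ahead = direction_agreement(prev_obs, last_obs, (20, 0, 30, 10))
sideways = direction_agreement(prev_obs, last_obs, (10, 10, 20, 20))
behind = direction_agreement(prev_obs, last_obs, (0, 0, 10, 10))
print(ahead, sideways, behind)
```

With equal IoU, the candidate that continues the track's heading receives the higher combined score, which is how a direction term can break ties between nearby detections.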
BoT-SORT is the choice when camera ego-motion is strong and you need the most stable identities. It extends ByteTrack with camera motion compensation (CMC) and confidence-aware association, which reduces ID switches on panning or handheld footage. Use BoT-SORT for sports broadcasts, drone video, or any scene where the camera moves frequently. The CMC overhead is small relative to the detector, so the trade-off favors identity stability over raw speed.
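The core of camera motion compensation is warping last-frame track positions into the current frame before association. The sketch below assumes a 2x3 affine matrix has already been estimated elsewhere (for example from sparse optical flow, as the `cmc_method: sparseOptFlow` setting suggests) and applies it to a box. It is a simplification: a full tracker would typically correct the Kalman filter state rather than raw boxes, and `warp_box` is a hypothetical helper name.

```python
def warp_box(box, affine):
    """Apply a 2x3 affine [[a, b, tx], [c, d, ty]] to an (x1, y1, x2, y2) box.

    Returns the axis-aligned box that encloses the four warped corners.
    """
    (a, b, tx), (c, d, ty) = affine
    x1, y1, x2, y2 = box
    corners = [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]
    warped = [(a * x + b * y + tx, c * x + d * y + ty) for x, y in corners]
    xs = [p[0] for p in warped]
    ys = [p[1] for p in warped]
    return (min(xs), min(ys), max(xs), max(ys))


# A pure camera pan: the whole image content shifts 15 px right and 4 px up.
pan = [[1.0, 0.0, 15.0], [0.0, 1.0, -4.0]]
predicted = (100.0, 50.0, 140.0, 130.0)
compensated = warp_box(predicted, pan)
print(compensated)
```

After compensation the predicted box lines up with where a static object appears in the new frame, so IoU-based association is no longer penalized by the pan.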
C-BIoU targets fast or irregular motion when you want buffered, cascaded geometric matching without camera motion compensation. In these benchmarks it leads all three metrics on SoccerNet after tuning, reaches the highest tuned IDF1 and MOTA on DanceTrack, and achieves the highest IDF1 on MOT17 among the trackers listed here. Use C-BIoU when BoT-SORT-style association is a good fit but CMC is unavailable or harmful, or when plain IoU matching is too strict. See C-BIoU for buffer scales b1 and b2.
Metric Definitions
HOTA (Higher Order Tracking Accuracy) — the primary benchmark metric. HOTA decomposes tracking quality into detection accuracy (DetA) and association accuracy (AssA), then takes their geometric mean. It weights identity consistency equally with detection recall and precision, unlike older metrics that under-penalize fragmented tracks. Higher HOTA indicates both good detection and stable long-term identity.
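A minimal sketch of the final HOTA combination step, assuming DetA and AssA have already been computed at each localization threshold. In the standard definition the per-threshold geometric means are averaged over a range of thresholds; the `hota` function name and the input values here are purely illustrative.

```python
import math


def hota(det_a, ass_a):
    """Average sqrt(DetA * AssA) over localization thresholds."""
    if not det_a or len(det_a) != len(ass_a):
        raise ValueError("det_a and ass_a must be non-empty and equally long")
    per_threshold = [math.sqrt(d * a) for d, a in zip(det_a, ass_a)]
    return sum(per_threshold) / len(per_threshold)


# Strong detection cannot hide weak association: sqrt(0.81 * 0.25) = 0.45.
strong_det_weak_assoc = hota([0.81], [0.25])
balanced = hota([0.64], [0.36])
print(round(strong_det_weak_assoc, 2), round(balanced, 2))
```

Because of the geometric mean, a tracker with 0.81 detection accuracy but only 0.25 association accuracy scores below one with a more balanced 0.64 and 0.36.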
IDF1 (Identity F1) — measures how long the system correctly identifies each ground-truth object over its lifetime. IDF1 is the harmonic mean of identification precision and identification recall. High IDF1 means tracks stay on the correct identity; low IDF1 means frequent identity switches.
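As a small worked example, IDF1 can be computed from identity-level true positives, false positives, and false negatives (IDTP, IDFP, IDFN). The function names are illustrative and the counts are made up.

```python
def id_precision(idtp, idfp):
    """Fraction of predicted detections assigned to the correct identity."""
    return idtp / (idtp + idfp) if (idtp + idfp) else 0.0


def id_recall(idtp, idfn):
    """Fraction of ground-truth detections covered by the correct identity."""
    return idtp / (idtp + idfn) if (idtp + idfn) else 0.0


def idf1(idtp, idfp, idfn):
    """Harmonic mean of ID precision and ID recall, in its closed form."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0


score = idf1(idtp=80, idfp=20, idfn=20)
precision = id_precision(80, 20)
recall = id_recall(80, 20)
harmonic = 2 * precision * recall / (precision + recall)
print(score, harmonic)
```

The closed form `2 * IDTP / (2 * IDTP + IDFP + IDFN)` and the explicit harmonic mean give the same value.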
MOTA (Multiple Object Tracking Accuracy) — combines the count of false positives, missed detections, and identity switches into a single score relative to the total number of ground-truth objects. MOTA is dominated by detection recall and precision; a detector with near-perfect recall produces high MOTA even when identity switches are frequent.
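A short worked example of the MOTA formula, using made-up counts. The `mota` helper is illustrative only.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, where GT is the total ground-truth object count."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


# Near-perfect detection with 50 identity switches still scores 0.92.
score = mota(false_negatives=20, false_positives=10, id_switches=50, num_gt=1000)
print(round(score, 2))
```

Each identity switch counts as a single error no matter how long the track stays on the wrong identity afterwards, which is why MOTA can remain high while IDF1 and HOTA drop.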