
Trackers API

SORT

trackers.core.sort.tracker.SORTTracker

Bases: BaseTracker

In SORT, object tracking begins with high-confidence detections fed into a Kalman filter framework assuming uniform motion for state prediction across frames. Association occurs via IoU-based costs in the Hungarian algorithm, enforcing a threshold to filter weak matches and initialize new identities. Tracks persist only with consistent associations, terminating quickly to avoid erroneous propagation. This detection-driven approach underscores the importance of upstream detector performance in achieving competitive multi-object tracking results. Over time, SORT has become a cornerstone for evaluating motion-based improvements in the field.
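The IoU-based association described above can be illustrated with a small, self-contained sketch (an illustration of the idea, not the library's internal code): build a pairwise IoU matrix between predicted track boxes and new detections, solve it with the Hungarian algorithm, and drop matches below the threshold.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou_matrix(tracks: np.ndarray, dets: np.ndarray) -> np.ndarray:
    """Pairwise IoU between (N, 4) track boxes and (M, 4) detection boxes in xyxy format."""
    x1 = np.maximum(tracks[:, None, 0], dets[None, :, 0])
    y1 = np.maximum(tracks[:, None, 1], dets[None, :, 1])
    x2 = np.minimum(tracks[:, None, 2], dets[None, :, 2])
    y2 = np.minimum(tracks[:, None, 3], dets[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_t = (tracks[:, 2] - tracks[:, 0]) * (tracks[:, 3] - tracks[:, 1])
    area_d = (dets[:, 2] - dets[:, 0]) * (dets[:, 3] - dets[:, 1])
    return inter / (area_t[:, None] + area_d[None, :] - inter)

def associate(tracks: np.ndarray, dets: np.ndarray, min_iou: float = 0.3):
    """Hungarian matching on negated IoU; weak matches below min_iou are discarded."""
    iou = iou_matrix(tracks, dets)
    rows, cols = linear_sum_assignment(-iou)  # negate to maximize total IoU
    return [(r, c) for r, c in zip(rows, cols) if iou[r, c] >= min_iou]
```

With one predicted track at (0, 0, 10, 10) and detections at (1, 1, 11, 11) and (50, 50, 60, 60), only the overlapping first detection is matched; the second falls below the threshold and would seed a new identity.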

SORT's standout strength is its real-time capability, processing hundreds of frames per second while maintaining accuracy comparable to more complex offline methods. It performs well in controlled environments with reliable detections, minimizing computational demands. However, without mechanisms for re-identification, it incurs frequent identity switches during object reappearances post-occlusion. The linear motion assumption limits effectiveness in non-linear paths, such as those in sports or wildlife tracking. Ultimately, SORT's efficiency is offset by its sensitivity to environmental complexities, necessitating hybrid extensions for broader applicability.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `lost_track_buffer` | `int` | Number of frames to buffer when a track is lost. Increasing this value enhances occlusion handling but may increase ID switching for similar objects. | `30` |
| `frame_rate` | `float` | Video frame rate in frames per second. Used to scale the lost track buffer for consistent tracking across different frame rates. | `30.0` |
| `track_activation_threshold` | `float` | Minimum detection confidence to create new tracks. Higher values reduce false positives but may miss low-confidence objects. | `0.25` |
| `minimum_consecutive_frames` | `int` | Number of consecutive frames before a track is considered valid. Before reaching this threshold, tracks are assigned a `tracker_id` of -1. | `3` |
| `minimum_iou_threshold` | `float` | IoU threshold for associating detections to existing tracks. Higher values require more overlap. | `0.3` |

update(detections)

Update tracker state with new detections and return tracked objects. Performs Kalman filter prediction, IoU-based association, and initializes new tracks for unmatched high-confidence detections.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `detections` | `Detections` | `sv.Detections` containing bounding boxes with shape `(N, 4)` in `(x_min, y_min, x_max, y_max)` format and optional confidence scores. | required |

Returns:

| Type | Description |
|------|-------------|
| `Detections` | `sv.Detections` with `tracker_id` assigned for each detection. Unmatched or immature tracks have a `tracker_id` of -1. |

reset()

Reset tracker state by clearing all tracks and resetting ID counter. Call this method when switching to a new video or scene.

ByteTrack

trackers.core.bytetrack.tracker.ByteTrackTracker

Bases: BaseTracker

ByteTrack operates online by processing all detector outputs, categorizing them by confidence thresholds to enable a two-stage association process. High-score boxes are initially linked to tracklets via Kalman filter predictions and IoU-based Hungarian matching, optionally enhanced with appearance features. Low-score boxes follow in a secondary matching phase using pure motion similarity to revive occluded tracks. Tracks without matches are kept briefly for potential re-association, preventing premature termination. This inclusive approach addresses common pitfalls in detection filtering, establishing ByteTrack as a flexible enhancer for existing tracking frameworks.
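The confidence-based categorization that drives the two stages can be sketched as a simple partition (the thresholds here are illustrative, not the library's internals):

```python
import numpy as np

def split_by_confidence(confidence: np.ndarray,
                        high_thresh: float = 0.6,
                        low_floor: float = 0.25):
    """Partition detection indices the way ByteTrack's two-stage association does:
    high-score boxes feed the first matching round, low-score (but non-trivial)
    boxes feed the secondary recovery round; the rest are discarded."""
    high = np.flatnonzero(confidence >= high_thresh)
    low = np.flatnonzero((confidence < high_thresh) & (confidence > low_floor))
    return high, low
```

For scores `[0.9, 0.5, 0.05]`, the first detection enters the primary round, the second is held for low-score recovery, and the third is dropped entirely.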

ByteTrack excels in dense environments, where its low-score recovery mechanism minimizes missed detections and enhances overall trajectory completeness. It consistently improves performance across diverse datasets, demonstrating robustness and generalization. The tracker's speed remains competitive, facilitating integration into production pipelines. On the downside, it is highly dependent on detector quality, with performance drops in noisy or low-resolution inputs. Additionally, the motion-only secondary association may lead to erroneous matches in scenes with similar moving objects.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `lost_track_buffer` | `int` | Number of frames to buffer when a track is lost. Increasing this value enhances occlusion handling but may increase ID switching for disappearing objects. | `30` |
| `frame_rate` | `float` | Video frame rate in frames per second. Used to scale the lost track buffer for consistent tracking across different frame rates. | `30.0` |
| `track_activation_threshold` | `float` | Minimum detection confidence to create new tracks. Higher values reduce false positives but may miss low-confidence objects. | `0.7` |
| `minimum_consecutive_frames` | `int` | Number of consecutive frames before a track is considered valid. Before reaching this threshold, tracks are assigned a `tracker_id` of -1. | `2` |
| `minimum_iou_threshold` | `float` | IoU threshold for associating detections to existing tracks. Higher values require more overlap. | `0.1` |
| `high_conf_det_threshold` | `float` | Threshold for separating high- and low-confidence detections in the two-stage association. | `0.6` |

update(detections)

Update tracker state with new detections and return tracked objects. Performs Kalman filter prediction, two-stage association (high then low confidence), and initializes new tracks for unmatched detections.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `detections` | `Detections` | `sv.Detections` containing bounding boxes with shape `(N, 4)` in `(x_min, y_min, x_max, y_max)` format and optional confidence scores. | required |

Returns:

| Type | Description |
|------|-------------|
| `Detections` | `sv.Detections` with `tracker_id` assigned for each detection. Unmatched detections have a `tracker_id` of -1. Detection order may differ from input. |

reset()

Reset tracker state by clearing all tracks and resetting ID counter. Call this method when switching to a new video or scene.

OC-SORT

trackers.core.ocsort.tracker.OCSORTTracker

Bases: BaseTracker

OC-SORT enhances traditional SORT by shifting to an observation-centric paradigm, using detections to correct Kalman filter errors accumulated during occlusions. It introduces Observation-Centric Re-Update to generate virtual trajectories for parameter refinement upon track reactivation. Association incorporates Observation-Centric Momentum, blending IoU with direction consistency from historical observations. Short-term recoveries are aided by heuristics linking unmatched tracks to prior detections. This rethinking prioritizes real measurements over estimations, making OC-SORT particularly adept at handling real-world tracking challenges.
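The direction-consistency term in Observation-Centric Momentum can be sketched as follows. This is a hedged illustration of the idea using a cosine similarity between box-center displacement vectors; OC-SORT's actual formulation works with the angle between these vectors, but the intuition is the same:

```python
import numpy as np

def direction_consistency(prev_obs, last_obs, detection) -> float:
    """Consistency between a track's historical motion direction
    (prev_obs -> last_obs) and the candidate association direction
    (last_obs -> detection). Inputs are (x, y) box centers.
    Returns a value in [-1, 1]; higher means better aligned."""
    v_track = np.asarray(last_obs, dtype=float) - np.asarray(prev_obs, dtype=float)
    v_assoc = np.asarray(detection, dtype=float) - np.asarray(last_obs, dtype=float)
    norm = np.linalg.norm(v_track) * np.linalg.norm(v_assoc)
    if norm == 0:
        return 0.0  # no motion history or zero displacement: stay neutral
    return float(np.dot(v_track, v_assoc) / norm)
```

A detection that continues the track's direction scores 1.0, while one that reverses it scores -1.0; blended into the association cost with a weight (cf. the `direction_consistency_weight` parameter below), this biases matching toward motion-consistent candidates.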

OC-SORT's primary strength is its robustness to non-linear motions and occlusions, outperforming baselines on datasets with erratic movements like DanceTrack. It maintains extreme efficiency, processing over 700 frames per second on CPUs for scalable deployments. The tracker excels in crowded scenes, reducing identity switches through momentum-based associations. However, lacking appearance features, it can confuse similar objects in overlapping paths. Its linear motion core still imposes limits in extreme velocity variations, requiring careful parameter selection.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `lost_track_buffer` | `int` | Number of frames to buffer when a track is lost. Increasing this value enhances occlusion handling but may increase ID switching for similar objects. | `30` |
| `frame_rate` | `float` | Video frame rate in frames per second. Used to scale the lost track buffer for consistent tracking across different frame rates. | `30.0` |
| `minimum_consecutive_frames` | `int` | Number of consecutive frames before a track is considered valid. Before reaching this threshold, tracks are assigned a `tracker_id` of -1. | `3` |
| `minimum_iou_threshold` | `float` | IoU threshold for associating detections to existing tracks. Higher values require more overlap. | `0.3` |
| `direction_consistency_weight` | `float` | Weight for direction consistency in the association cost. Higher values prioritize angle alignment between motion and association direction. | `0.2` |
| `high_conf_det_threshold` | `float` | Threshold for high-confidence detections. Lower-confidence detections are excluded from association. | `0.6` |
| `delta_t` | `int` | Number of past frames to use for velocity estimation. Higher values provide more stable direction estimates during occlusion. | `3` |

update(detections)

Update tracker state with new detections and return tracked objects. Performs Kalman filter prediction, two-stage association using direction consistency and last-observation recovery, and initializes new tracks for unmatched high-confidence detections.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `detections` | `Detections` | `sv.Detections` containing bounding boxes with shape `(N, 4)` in `(x_min, y_min, x_max, y_max)` format and optional confidence scores. | required |

Returns:

| Type | Description |
|------|-------------|
| `Detections` | `sv.Detections` with `tracker_id` assigned for each detection. Unmatched or immature tracks have a `tracker_id` of -1. |

reset()

Reset tracker state by clearing all tracks and resetting ID counter. Call this method when switching to a new video or scene.

Utilities

trackers.utils.converters.xyxy_to_xcycsr(xyxy)

Convert bounding boxes from corner to center-scale-ratio format.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `xyxy` | `ndarray` | Bounding boxes `[x_min, y_min, x_max, y_max]` with shape `(4,)` for a single box or `(N, 4)` for multiple boxes. | required |

Returns:

| Type | Description |
|------|-------------|
| `ndarray` | Bounding boxes `[x_center, y_center, scale, aspect_ratio]` with the same shape as the input, where `scale` is the box area (`width * height`) and `aspect_ratio` is `width / height`. |

Examples:

>>> import numpy as np
>>> from trackers import xyxy_to_xcycsr
>>>
>>> boxes = np.array([
...     [0,   0, 10, 10],
...     [0,   0, 20, 10],
...     [0,   0, 10, 20],
... ])
>>>
>>> xyxy_to_xcycsr(boxes)
array([[  5.        ,   5.        , 100.        ,   0.9999999 ],
       [ 10.        ,   5.        , 200.        ,   1.9999998 ],
       [  5.        ,  10.        , 200.        ,   0.49999998]])

trackers.utils.converters.xcycsr_to_xyxy(xcycsr)

Convert bounding boxes from center-scale-ratio to corner format.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `xcycsr` | `ndarray` | Bounding boxes `[x_center, y_center, scale, aspect_ratio]` with shape `(4,)` for a single box or `(N, 4)` for multiple boxes, where `scale` is the box area and `aspect_ratio` is `width / height`. | required |

Returns:

| Type | Description |
|------|-------------|
| `ndarray` | Bounding boxes `[x_min, y_min, x_max, y_max]` with the same shape as the input. |

Examples:

>>> import numpy as np
>>> from trackers import xcycsr_to_xyxy
>>>
>>> boxes = np.array([
...     [  5.,   5., 100., 1.],
...     [ 10.,   5., 200., 2.],
...     [  5.,  10., 200., 0.5],
... ])
>>>
>>> xcycsr_to_xyxy(boxes)
array([[ 0.,  0., 10., 10.],
       [ 0.,  0., 20., 10.],
       [ 0.,  0., 10., 20.]])