
SORT


Overview

SORT (Simple Online and Realtime Tracking) is a lean tracking-by-detection method that combines a Kalman filter for motion prediction with the Hungarian algorithm for data association. It takes object detections (commonly from a high-performing CNN-based detector) as input and updates each tracked object's bounding box using linear velocity estimates. Because SORT uses essentially no appearance model (only bounding-box geometry is considered), it is extremely fast and can run comfortably at hundreds of frames per second. This speed and simplicity make it well suited to real-time applications in robotics or surveillance, where rapid, approximate solutions are essential. However, its reliance on frame-to-frame matching makes SORT susceptible to ID switches and less robust during long occlusions, since there is no built-in re-identification module.
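
To make the data-association step concrete, here is a minimal sketch of IOU-based matching with the Hungarian algorithm. This is illustrative only, not the library's internals: iou and associate are hypothetical helpers, and SciPy's linear_sum_assignment solves the assignment.

import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(box_a, box_b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(predicted_boxes, detected_boxes, iou_threshold=0.3):
    # Build a cost matrix of negated IOUs (the solver minimizes cost),
    # solve the assignment, then reject matches below the IOU gate.
    cost = np.zeros((len(predicted_boxes), len(detected_boxes)))
    for i, p in enumerate(predicted_boxes):
        for j, d in enumerate(detected_boxes):
            cost[i, j] = -iou(p, d)
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if -cost[r, c] >= iou_threshold]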

Examples

Using Roboflow Inference:

import supervision as sv
from trackers import SORTTracker
from inference import get_model

tracker = SORTTracker()
model = get_model(model_id="yolov11m-640")
annotator = sv.LabelAnnotator(text_position=sv.Position.CENTER)

def callback(frame, _):
    result = model.infer(frame)[0]
    detections = sv.Detections.from_inference(result)
    detections = tracker.update(detections)
    return annotator.annotate(frame, detections, labels=detections.tracker_id)

sv.process_video(
    source_path="<INPUT_VIDEO_PATH>",
    target_path="<OUTPUT_VIDEO_PATH>",
    callback=callback,
)

Using RF-DETR:

import supervision as sv
from trackers import SORTTracker
from rfdetr import RFDETRBase

tracker = SORTTracker()
model = RFDETRBase()
annotator = sv.LabelAnnotator(text_position=sv.Position.CENTER)

def callback(frame, _):
    detections = model.predict(frame)
    detections = tracker.update(detections)
    return annotator.annotate(frame, detections, labels=detections.tracker_id)

sv.process_video(
    source_path="<INPUT_VIDEO_PATH>",
    target_path="<OUTPUT_VIDEO_PATH>",
    callback=callback,
)

Using Ultralytics YOLO:

import supervision as sv
from trackers import SORTTracker
from ultralytics import YOLO

tracker = SORTTracker()
model = YOLO("yolo11m.pt")
annotator = sv.LabelAnnotator(text_position=sv.Position.CENTER)

def callback(frame, _):
    result = model(frame)[0]
    detections = sv.Detections.from_ultralytics(result)
    detections = tracker.update(detections)
    return annotator.annotate(frame, detections, labels=detections.tracker_id)

sv.process_video(
    source_path="<INPUT_VIDEO_PATH>",
    target_path="<OUTPUT_VIDEO_PATH>",
    callback=callback,
)

Using Hugging Face Transformers (RT-DETRv2):

import torch
import supervision as sv
from trackers import SORTTracker
from transformers import RTDetrV2ForObjectDetection, RTDetrImageProcessor

tracker = SORTTracker()
processor = RTDetrImageProcessor.from_pretrained("PekingU/rtdetr_v2_r18vd")
model = RTDetrV2ForObjectDetection.from_pretrained("PekingU/rtdetr_v2_r18vd")
annotator = sv.LabelAnnotator(text_position=sv.Position.CENTER)

def callback(frame, _):
    # Preprocess the frame and run RT-DETRv2 inference.
    inputs = processor(images=frame, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Rescale the raw outputs to the frame's resolution and filter by confidence.
    h, w, _ = frame.shape
    results = processor.post_process_object_detection(
        outputs,
        target_sizes=torch.tensor([(h, w)]),
        threshold=0.5
    )[0]

    detections = sv.Detections.from_transformers(
        transformers_results=results,
        id2label=model.config.id2label
    )

    detections = tracker.update(detections)
    return annotator.annotate(frame, detections, labels=detections.tracker_id)

sv.process_video(
    source_path="<INPUT_VIDEO_PATH>",
    target_path="<OUTPUT_VIDEO_PATH>",
    callback=callback,
)

API

trackers.core.sort.tracker.SORTTracker

Bases: BaseTracker

Implements SORT (Simple Online and Realtime Tracking).

SORT is a pragmatic approach to multiple object tracking with a focus on simplicity and speed. It uses a Kalman filter for motion prediction and the Hungarian algorithm or simple IOU matching for data association.

Parameters:

- lost_track_buffer (int, default 30): Number of frames to buffer when a track is lost. Increasing lost_track_buffer improves tracking through occlusions, but raises the chance of ID switches between objects with similar appearance.
- frame_rate (float, default 30.0): Frame rate of the video in frames per second. Used to calculate the maximum time a track can remain lost.
- track_activation_threshold (float, default 0.25): Detection confidence threshold for track activation. Only detections with confidence above this threshold create new tracks. Raising it reduces false positives but may miss genuine objects detected with low confidence.
- minimum_consecutive_frames (int, default 3): Number of consecutive frames an object must be tracked before the track is considered valid. Raising it prevents accidental tracks from false or duplicate detections, but risks missing short-lived tracks. Until a track is confirmed, it is assigned -1 as its tracker_id.
- minimum_iou_threshold (float, default 0.3): IOU threshold for associating detections with existing tracks.
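
For example, a tracker tuned for longer occlusions could be configured as follows. The keyword names follow the parameter list above; the values are illustrative, not recommendations.

from trackers import SORTTracker

tracker = SORTTracker(
    lost_track_buffer=60,            # tolerate longer occlusions
    frame_rate=25.0,                 # match the source video's FPS
    track_activation_threshold=0.4,  # require more confident detections
    minimum_consecutive_frames=3,    # confirm a track after 3 consecutive frames
    minimum_iou_threshold=0.3,       # IOU gate for association
)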

update(detections)

Updates the tracker state with new detections.

Performs Kalman filter prediction, associates detections with existing trackers based on IOU, updates matched trackers, and initializes new trackers for unmatched high-confidence detections.

Parameters:

- detections (sv.Detections, required): The latest set of object detections from a frame.

Returns:

- sv.Detections: A copy of the input detections, augmented with an assigned tracker_id for each successfully tracked object. Detections not associated with a track will not have a tracker_id.
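
Because unconfirmed tracks carry -1 as their tracker_id (see minimum_consecutive_frames above), callers may want to drop them before annotating. A minimal sketch, assuming supervision's boolean-mask indexing:

detections = tracker.update(detections)
# Keep only confirmed tracks; unconfirmed ones carry tracker_id == -1.
confirmed = detections[detections.tracker_id != -1]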

reset()

Resets the tracker's internal state.

Clears all active tracks and resets the track ID counter.
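
This is useful when one tracker instance is reused across several videos, so IDs from one clip do not leak into the next. A sketch reusing the callback from the examples above (the file names are hypothetical):

for path in ["clip_a.mp4", "clip_b.mp4"]:
    sv.process_video(
        source_path=path,
        target_path=path.replace(".mp4", "_tracked.mp4"),
        callback=callback,
    )
    tracker.reset()  # clear all tracks and restart the ID counter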
