We conduct extensive experiments to empirically evaluate the proposed Grounded-GMOT on the G²MOT problem, covering both detection with GroundingDINO and association with KAM-SORT.
"Track red color car while excluding yellow, blue, black, white color car"
Quantitative Results
We present a detailed comparison in the six tables below. The first table contrasts our Grounded-GMOT approach with the existing one-shot GMOT (OS-GMOT) and YOLOv8 on the G²MOT dataset. The second table shows the superiority of our KAM-SORT tracker compared to other SOTA methods. The third table evaluates the performance of our method under various settings. The fourth table demonstrates the effectiveness and generalization of KAM-SORT by comparing it with other state-of-the-art MOT methods on the MOT20 dataset. Finally, the fifth and sixth tables report ablation studies on the parameter θ (theta), which measures the similarity between two vectors, and on α (alpha), which is used for uncertainty revision during tracking with KAM-SORT, respectively.
Tracking performance comparison of multiple trackers under various settings of MOT with YOLOv8, OS-GMOT (averaged over five runs), and our proposed Grounded-GMOT on the G²MOT dataset. The best score is in bold.

| Trackers | Settings | HOTA ↑ | MOTA ↑ | IDF1 ↑ | DetA ↑ | AssA ↑ |
|---|---|---|---|---|---|---|
| SORT | YOLOv8 (Fully-train) | 5.48 | -145.61 | 0.80 | 5.78 | 6.47 |
| SORT | OS (Five runs of OS) | 24.77 | 7.09 | 24.90 | 30.22 | 20.70 |
| SORT | Grounded-GMOT (Zero-shot) | 40.73 | 46.57 | 44.52 | 45.13 | 37.26 |
| DeepSORT | YOLOv8 (Fully-train) | 5.21 | -156.20 | 0.74 | 5.88 | 5.82 |
| DeepSORT | OS (Five runs of OS) | 22.59 | -0.20 | 21.66 | 29.30 | 17.89 |
| DeepSORT | Grounded-GMOT (Zero-shot) | 36.01 | 43.30 | 37.54 | 43.94 | 29.96 |
| ByteTrack | YOLOv8 (Fully-train) | 6.02 | -140.81 | 0.84 | 5.80 | 7.53 |
| ByteTrack | OS (Five runs of OS) | 25.16 | 8.02 | 26.46 | 29.38 | 21.94 |
| ByteTrack | Grounded-GMOT (Zero-shot) | 39.89 | 45.83 | 45.65 | 43.35 | 37.12 |
| OC-SORT | YOLOv8 (Fully-train) | 5.48 | -127.30 | 0.76 | 6.53 | 6.78 |
| OC-SORT | OS (Five runs of OS) | 25.17 | 12.62 | 25.96 | 29.66 | 21.67 |
| OC-SORT | Grounded-GMOT (Zero-shot) | 41.84 | 46.32 | 45.92 | 44.49 | 39.92 |
| Deep OC-SORT | YOLOv8 (Fully-train) | 5.72 | -145.60 | 0.81 | 5.80 | 6.94 |
| Deep OC-SORT | OS (Five runs of OS) | 25.65 | 7.06 | 25.92 | 30.47 | 21.92 |
| Deep OC-SORT | Grounded-GMOT (Zero-shot) | 40.53 | 46.12 | 43.08 | 46.01 | 36.27 |
| MOTRv2 | YOLOv8 (Fully-train) | 3.06 | 0.48 | 0.85 | 0.45 | 20.71 |
| MOTRv2 | OS (Five runs of OS) | 28.69 | 14.18 | 29.43 | 26.32 | 34.88 |
| MOTRv2 | Grounded-GMOT (Zero-shot) | 42.02 | 41.68 | 45.91 | 41.81 | 42.54 |
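
The strongly negative MOTA values reported for the YOLOv8 setting (for example -145.61 with SORT) can look surprising. Under the standard CLEAR-MOT definition, MOTA = 1 − (FN + FP + IDSW) / GT, so the score drops below zero as soon as the total number of errors exceeds the number of ground-truth boxes. The sketch below illustrates this with hypothetical counts; the function name `mota` and all numbers are assumptions chosen for illustration and are not taken from the G²MOT evaluation.

```python
def mota(num_gt, false_negatives, false_positives, id_switches):
    """CLEAR-MOT accuracy in percent: 100 * (1 - (FN + FP + IDSW) / GT)."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 100.0 * (1.0 - errors / num_gt)


# Hypothetical counts (not from the paper): a detector that misses most
# objects and also produces more false positives than there are
# ground-truth boxes.
score = mota(num_gt=1000, false_negatives=900, false_positives=1500, id_switches=50)
print(f"MOTA = {score:.2f}")  # negative, since 2450 errors > 1000 ground-truth boxes
```

Because MOTA has no lower bound, a large negative value mainly signals that false positives and misses dominate, rather than giving a calibrated measure of how much worse one setting is than another.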
Tracking performance comparison between the existing trackers and our proposed KAM-SORT tracker on the G²MOT dataset. The best score is in bold.

| Trackers | Settings | HOTA ↑ | MOTA ↑ | IDF1 ↑ | DetA ↑ | AssA ↑ |
|---|---|---|---|---|---|---|
| SORT [10] | Grounded-GMOT | 40.73 | 46.57 | 44.52 | 45.13 | 37.26 |
| DeepSORT [11] | Grounded-GMOT | 36.01 | 43.30 | 37.54 | 43.94 | 29.96 |
| ByteTrack [12] | Grounded-GMOT | 39.89 | 45.83 | 45.65 | 43.35 | 37.12 |
| OC-SORT [13] | Grounded-GMOT | 41.84 | 46.32 | 45.92 | 44.49 | 39.92 |
| Deep OC-SORT [14] | Grounded-GMOT | 40.53 | 46.12 | 43.08 | 46.01 | 36.27 |
| MOTRv2 [15] | Partly-trained | 42.02 | 41.68 | 45.91 | 41.81 | **42.54** |
| KAM-SORT (Ours) | Grounded-GMOT | **43.03** | **46.60** | **47.13** | **46.05** | 40.80 |
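
As a reading aid for the HOTA, DetA, and AssA columns: at a single localization threshold, HOTA is the geometric mean of DetA and AssA, and the final scores are averaged over thresholds. The reported HOTA is therefore only approximately equal to the square root of the product of the reported DetA and AssA. The sketch below checks this relationship on the rows of the KAM-SORT comparison table; the helper name `geometric_mean_gap` is an assumption introduced for illustration.

```python
import math

# (tracker, HOTA, DetA, AssA), copied from the KAM-SORT comparison table.
rows = [
    ("SORT", 40.73, 45.13, 37.26),
    ("DeepSORT", 36.01, 43.94, 29.96),
    ("ByteTrack", 39.89, 43.35, 37.12),
    ("OC-SORT", 41.84, 44.49, 39.92),
    ("Deep OC-SORT", 40.53, 46.01, 36.27),
    ("MOTRv2", 42.02, 41.81, 42.54),
    ("KAM-SORT", 43.03, 46.05, 40.80),
]


def geometric_mean_gap(hota, det_a, ass_a):
    """Difference between sqrt(DetA * AssA) and the reported HOTA."""
    return math.sqrt(det_a * ass_a) - hota


gaps = {name: geometric_mean_gap(h, d, a) for name, h, d, a in rows}
for name, gap in gaps.items():
    print(f"{name}: sqrt(DetA * AssA) - HOTA = {gap:+.2f}")
```

The gaps stay well below one point for every tracker, which is consistent with HOTA balancing detection and association quality: a tracker with the highest AssA (MOTRv2 here) can still have a lower HOTA if its DetA is lower.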
Tracking performance of KAM-SORT on G²MOT with various settings.

| Settings | | HOTA ↑ | MOTA ↑ | IDF1 ↑ | DetA ↑ | AssA ↑ |
|---|---|---|---|---|---|---|
| attribute + | classname | 42.20 | 43.26 | 45.29 | 44.73 | 40.15 |
| | definition | 34.04 | 26.45 | 35.83 | 34.00 | 34.49 |
| | caption | 43.03 | 46.60 | 47.13 | 46.05 | 40.80 |
Ablation study on the effectiveness of KAM-SORT on the MOT20 test set for the MOT task. Because ByteTrack and OC-SORT use different thresholds for individual test-set sequences together with an offline interpolation procedure, we also report their scores with these disabled, denoted ByteTrack† and OC-SORT†. Because Deep OC-SORT uses separate weights for YOLOX, we also report its scores after retraining YOLOX on the MOT20 training set, denoted Deep OC-SORT†.

| Trackers | HOTA ↑ | MOTA ↑ | IDF1 ↑ |
|---|---|---|---|
| MeMOT [16] | 54.1 | 63.7 | 66.1 |
| FairMOT [17] | 54.6 | 61.8 | 67.3 |
| GSDT [18] | 53.6 | 67.1 | 67.5 |
| CSTrack [19] | 54.0 | 66.6 | 68.6 |
| ByteTrack [20] | 61.3 | 77.8 | 75.2 |
| OC-SORT [21] | 62.4 | 75.7 | 76.3 |
| Deep OC-SORT [22] | 63.9 | 75.6 | 79.2 |
| ByteTrack† [20] | 60.4 | 74.2 | 74.5 |
| OC-SORT† [21] | 60.5 | 73.1 | 74.4 |
| Deep OC-SORT† [22] | 59.6 | 75.3 | 75.2 |
| KAM-SORT (Ours) | 62.6 | 75.2 | 76.9 |
An ablation study conducted on the G²MOT dataset to demonstrate the impact of each proposed component within KAM-SORT.