Datasets | Task | NLP | #Cat. | #Vid. | #Frames | #Tracks | #Boxes | Obj. | App. | Den. | Occ. | Mot. |
---|---|---|---|---|---|---|---|---|---|---|---|---|
OTB2013 | SOT | ✗ | 10 | 51 | 29K | 51 | 29K | -- | -- | -- | -- | -- |
VOT2017 | SOT | ✗ | 24 | 60 | 21K | 60 | 21K | -- | -- | -- | -- | -- |
TrackingNet | SOT | ✗ | 21 | 31K | 14M | 31K | 14M | -- | -- | -- | -- | -- |
MOT17 | MOT | ✗ | 1 | 14 | 11.2K | 1.3K | 0.3M | 39(35) | 62(10) | 3.85(1.50) | 14(16) | 94(11) |
MOT20 | MOT | ✗ | 1 | 8 | 13.41K | 3.45K | 1.65M | 150(70) | 68(8) | 6.42(1.20) | 15(15) | 96(4) |
Omni-MOT | MOT | ✗ | 1 | -- | 14M+ | 250K | 110M | -- | -- | -- | -- | -- |
DanceTrack | MOT | ✗ | 1 | 100 | 105K | 990 | -- | 9(5) | 77(7) | 2.67(0.99) | 21(17) | 90(9) |
TAO | MOT | ✗ | 833 | 2.9K | 2.6M | 17.2K | 333K | 3(2) | 69(7) | 1.82(0.76) | 11(14) | 49(34) |
SportsMOT | MOT | ✗ | 1 | 240 | 150K | 3.4K | 1.62M | 11(3) | 73(8) | 2.44(0.80) | 18(17) | 80(16) |
GOT-10k | GSOT | ✗ | 563 | 10K | 1.5M | 10K | 1.5M | -- | -- | -- | -- | -- |
Fish | GSOT | ✗ | 1 | 1.6K | 527.2K | 8.25K | 516K | -- | -- | -- | -- | -- |
AnimalTrack | GMOT | ✗ | 10 | 58 | 24.7K | 1.92K | 429K | 17(9) | 72(8) | 3.13(1.22) | 15(15) | 91(11) |
GMOT-40 | GMOT | ✗ | 10 | 40 | 9K | 2.02K | 256K | 24(17) | 71(9) | 2.56(0.88) | 11(12) | 43(44) |
LaSOT | SOT | coarse | 70 | 1.4K | 3.52M | 1.4K | 3.52M | -- | -- | -- | -- | -- |
TNL2K | SOT | coarse | -- | 2K | 1.24M | 2K | 1.24M | -- | -- | -- | -- | -- |
Refer-KITTI | MOT | coarse | 2 | 18 | 6.65K | 637 | 28.72K | 5(4) | 65(6) | 1.78(0.74) | 11(11) | 73(21) |
G2MOT (Ours) | GMOT | fine | 20 | 253 | 157.2K | 5.84K | 1.87M | 12(5) | 74(8) | 2.65(0.95) | 18(16) | 84(14) |
Combining datasets in object tracking offers strategic advantages. First, individual tracking datasets each focus on specific challenges. Second, merging them yields a diverse set of challenges that require tracking models to perform efficiently in varied scenarios. By combining datasets, we can therefore evaluate a tracking model's ability to handle diverse conditions, e.g., object movement, density, similar appearance, and occlusion, which is in line with the goal of the GMOT challenge. Finally, our ultimate objective is to propose a new paradigm for GMOT and to create a challenging benchmark dataset covering various demanding real-world scenarios.
Each video in these datasets has been carefully annotated with several details:
For text label:
For track label:
Each line contains 9 elements, separated by commas:
<frame>, <id>, <bb_left>, <bb_top>, <bb_width>, <bb_height>, <conf>, <x>, <y>
The annotations are formatted in JSON, and we provide examples to illustrate how they are structured. This data, prepared by 4 annotators, will be shared publicly.
Text label for referring with specific attributes:

    {
        id: "",
        video_id: "",
        is_eval: "",
        type: "",
        superset_idx: "",
        class_name: "",
        synonyms: [],
        definition: "",
        attributes: [],
        track_path: "",
        caption: "",
    }

Track label for associating objects' IDs through time:

    1, 1, xl, yt, w, h, 1, 1, 1
    1, 2, xl, yt, w, h, 1, 1, 1
    1, 3, xl, yt, w, h, 1, 1, 1
    2, 1, xl, yt, w, h, 1, 1, 1
    2, 2, xl, yt, w, h, 1, 1, 1
    2, 3, xl, yt, w, h, 1, 1, 1
    3, 1, xl, yt, w, h, 1, 1, 1
    3, 2, xl, yt, w, h, 1, 1, 1
    3, 3, xl, yt, w, h, 1, 1, 1

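The 9-element track-label lines can be read with a few lines of Python. The following is a minimal sketch, not the authors' tooling: the names `TrackBox`, `parse_track_label`, and `group_by_track` are hypothetical, and the numeric box values in `SAMPLE` are made up to stand in for the `xl, yt, w, h` placeholders. It assumes every field is numeric and that frame and ID are integers.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackBox:
    """One annotation line: <frame>, <id>, <bb_left>, <bb_top>,
    <bb_width>, <bb_height>, <conf>, <x>, <y>."""
    frame: int
    track_id: int
    left: float
    top: float
    width: float
    height: float
    conf: float
    x: float
    y: float


def parse_track_label(text: str) -> list[TrackBox]:
    """Parse comma-separated track-label text into TrackBox records."""
    boxes = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = [f.strip() for f in line.split(",")]
        if len(fields) != 9:
            raise ValueError(f"expected 9 fields, got {len(fields)}: {line!r}")
        frame, track_id, *rest = fields
        boxes.append(TrackBox(int(frame), int(track_id), *map(float, rest)))
    return boxes


def group_by_track(boxes: list[TrackBox]) -> dict[int, list[TrackBox]]:
    """Group boxes by object ID and sort each trajectory by frame."""
    tracks = defaultdict(list)
    for box in boxes:
        tracks[box.track_id].append(box)
    for trajectory in tracks.values():
        trajectory.sort(key=lambda b: b.frame)
    return dict(tracks)


# Illustrative values only (assumption): two objects over two frames.
SAMPLE = """\
1, 1, 10, 20, 30, 40, 1, 1, 1
1, 2, 50, 60, 30, 40, 1, 1, 1
2, 1, 12, 21, 30, 40, 1, 1, 1
2, 2, 52, 61, 30, 40, 1, 1, 1
"""

tracks = group_by_track(parse_track_label(SAMPLE))
```

Grouping by `<id>` recovers one trajectory per object, which is the association through time that the track label encodes.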
    video: "airplane-1",
    label: {
        class_name: "helicopter",
        class_synonyms: ["airplane", "aircraft", "jet", "plane"],
        definition: "a vehicle designed for flight in the air",
        include_attributes: ["black", "flying"],
        exclude_attributes: [],
        caption: "Track all black flying helicopters",
        track_path: "airplane_01.txt"
    }

    video: "car-1",
    label: {
        class_name: "car",
        class_synonyms: ["vehicle", "automobile", "auto", "transport", "transportation"],
        definition: "mechanical device designed for transportation, powered by an engine or motor, equipped by four wheels",
        include_attributes: ["white headlight", "oncoming traffic"],
        exclude_attributes: ["red taillight", "opposite traffic"],
        caption: "Track white headlight cars while excluding red taillight cars",
        track_path: "car_01.txt",
    }
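One way to read the `include_attributes` and `exclude_attributes` fields is as a filter over the objects of a class. The sketch below illustrates that reading and is an assumption rather than the authors' definition: an object counts as a target when it shows every included attribute and none of the excluded ones. The `TextLabel` class, the `is_target` helper, and the per-object attribute sets are hypothetical and not part of the released annotations.

```python
from dataclasses import dataclass, field


@dataclass
class TextLabel:
    """Fields mirror the example labels; defaults are placeholders."""
    class_name: str
    class_synonyms: list[str] = field(default_factory=list)
    definition: str = ""
    include_attributes: list[str] = field(default_factory=list)
    exclude_attributes: list[str] = field(default_factory=list)
    caption: str = ""
    track_path: str = ""


def is_target(label: TextLabel, observed: set[str]) -> bool:
    """Assumed semantics: all included attributes present, no excluded one."""
    has_all_included = all(a in observed for a in label.include_attributes)
    has_any_excluded = any(a in observed for a in label.exclude_attributes)
    return has_all_included and not has_any_excluded


# Values copied from the "car-1" example label.
car = TextLabel(
    class_name="car",
    include_attributes=["white headlight", "oncoming traffic"],
    exclude_attributes=["red taillight", "opposite traffic"],
    caption="Track white headlight cars while excluding red taillight cars",
    track_path="car_01.txt",
)

# Hypothetical per-object attribute sets, for illustration only.
oncoming = {"white headlight", "oncoming traffic"}
receding = {"red taillight", "opposite traffic"}

print(is_target(car, oncoming), is_target(car, receding))
```

Under this reading, an empty `exclude_attributes` list, as in the "airplane-1" example, simply places no exclusion constraint on the target set.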