The distance between appearance features is used for computing the affinity value in data association. An affinity value based on an ideal appearance feature should be large for persons of the same identity and small for persons of different identities. In our implementation, we extract the appearance feature using a network similar to GoogLeNet. The input size of our network is \(96 \times 96\), and the kernel size of the pool5 layer is \(3 \times 3\) instead of \(7 \times 7\). The output layer is a fully connected layer that outputs a 128-dimensional feature. In the tracking phase, patches are first cropped according to the detection responses and then resized to \(96 \times 96\) for feature extraction. The cosine distance is used for measuring the appearance affinity.

For training, we collect a dataset of nearly 119K patches from 19835 identities. It combines multiple person re-id datasets, including PRW, Market-1501, VIPeR and CUHK03. We use the softmax and triplet losses jointly during training. The softmax loss guarantees the discriminative ability of the appearance feature, while the triplet loss ensures that the cosine distance between appearance features of the same identity is small. \(\textit\) are set as 0.5.

Algorithm Efficiency. H\(^2\)T is slow in handling long tracking sequences that contain many targets. Among the steps of the algorithm, the DN search is the most time-consuming. More specifically, the larger the affinity matrix, the longer the DN search takes.
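The text above describes measuring appearance affinity with the cosine distance and training with a triplet loss on that distance. A minimal sketch of both ideas follows, in plain Python. The function names, the toy 2-dimensional vectors (standing in for the 128-dimensional features) and the margin value of 0.2 are all assumptions made for illustration; none of them comes from the original implementation, and the softmax part of the joint loss is not shown.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_distance(a, b):
    """Return 1 - cos(a, b): near 0 for aligned features, near 1 for orthogonal ones."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    ua, ub = l2_normalize(a), l2_normalize(b)
    return 1.0 - sum(x * y for x, y in zip(ua, ub))


def affinity_matrix(track_feats, det_feats):
    """Rows index tracks, columns index detections; each entry is a cosine similarity."""
    return [[1.0 - cosine_distance(t, d) for d in det_feats] for t in track_feats]


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on cosine distance.

    The margin is a hypothetical value chosen for this sketch.
    """
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    tracks = [[1.0, 0.0]]                  # one existing track
    detections = [[2.0, 0.0], [0.0, 3.0]]  # two new detections
    print(affinity_matrix(tracks, detections))
```

In this toy case the first detection points in the same direction as the track and the second is orthogonal to it, so the first affinity is high and the second is low, which is the behaviour the text asks of an ideal appearance feature.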
Detection and learning-based appearance features play the central role in data-association-based multiple object tracking (MOT), but most recent MOT works ignore them and focus only on hand-crafted features and association algorithms. In this paper, we explore high-performance detection and deep-learning-based appearance features, and show that they lead to significantly better MOT results in both the online and offline settings. We make our detection and appearance feature publicly available ( ). In the following part, we first summarize the detection and appearance feature, and then introduce our tracker, named Person of Interest (POI), which has both an online and an offline version. (We use POI to denote our online tracker and KDNT to denote our offline tracker in submission.)