Topic Editors

Dr. Shunli Zhang
School of Software Engineering, Beijing Jiaotong University, 100044 Beijing, China
Dr. Xin Yu
School of Computer Science, University of Technology Sydney, Australia
Prof. Dr. Kaihua Zhang
School of Information and Control, Nanjing University of Information Science and Technology, Nanjing, China
Dr. Yang Yang
School of Information Science and Engineering, Shandong University, Qingdao, China

Visual Object Tracking: Challenges and Applications

Abstract submission deadline
closed (31 August 2023)
Manuscript submission deadline
closed (31 October 2023)
Viewed by
15360

Topic Information

Dear Colleagues,

Visual tracking aims to locate, in every subsequent frame, the target specified in the initial frame; it has many real-world applications such as video surveillance, augmented reality, and behavior analysis. Despite numerous efforts, it remains a challenging task due to factors such as deformation, illumination change, rotation, and occlusion, to name a few. This Topic promotes scientific dialogue on the added value of novel methodological approaches and research in the specified areas. Our interest spans the entire end-to-end spectrum of visual object tracking research, from motion estimation, appearance representation, strategic frameworks, models, and best practices to sophisticated research related to radical innovation. The topics of interest include, but are not limited to, the following indicative list:

  • Enabling technologies for visual object tracking research:
    • Machine learning;
    • Neural networks;
    • Image processing;
    • Bot technology;
    • AI agents;
    • Reinforcement learning;
    • Edge computing;
  • Methodologies, frameworks, and models for artificial intelligence and visual object tracking research:
    • For innovations in business, research, academia, industry, and technology;
    • For theoretical foundations and contributions to the body of knowledge of visual object tracking;
  • Best practices and use cases;
  • Outcomes of R&D projects;
  • Industry-government collaboration;
  • Security and privacy issues;
  • Ethics of visual object tracking and AI;
  • Social impact of AI.
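As a toy illustration of the core problem stated above, locating a template taken from the initial frame inside a later frame, the classical baseline is normalized cross-correlation. The brute-force sketch below uses plain NumPy; the function and variable names are illustrative, and real trackers replace raw pixels with learned features and use FFT-based correlation for speed.

```python
import numpy as np

def ncc_track(frame, template):
    """Locate `template` in `frame` by normalized cross-correlation.

    Returns the (row, col) of the best-matching top-left corner.
    Brute-force sliding-window version for clarity only.
    """
    th, tw = template.shape
    fh, fw = frame.shape
    t = template - template.mean()
    tn = np.linalg.norm(t) + 1e-8
    best, best_pos = -np.inf, (0, 0)
    for r in range(fh - th + 1):
        for c in range(fw - tw + 1):
            w = frame[r:r + th, c:c + tw]
            wz = w - w.mean()
            score = float((wz * t).sum()) / ((np.linalg.norm(wz) + 1e-8) * tn)
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos
```

Deformation, illumination change, rotation, and occlusion, the challenges listed above, are precisely the conditions under which this fixed-template matching breaks down, which motivates the learned approaches covered by this Topic.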

Dr. Shunli Zhang
Dr. Xin Yu
Prof. Dr. Kaihua Zhang
Dr. Yang Yang
Topic Editors

Keywords

  • artificial intelligence
  • computer vision
  • visual object tracking
  • reinforcement learning
  • deep learning
  • feature extraction
  • trajectory prediction

Participating Journals

Journal Name        Abbreviation  Impact Factor  CiteScore  Launched Year  First Decision (median)  APC
Applied Sciences    applsci       2.5            5.3        2011           17.8 Days                CHF 2400
Electronics         electronics   2.6            5.3        2012           16.8 Days                CHF 2400
Journal of Imaging  jimaging      2.7            5.9        2015           20.9 Days                CHF 1800
Sensors             sensors       3.4            7.3        2001           16.8 Days                CHF 2600
Signals             signals       -              3.2        2020           26.1 Days                CHF 1000

Preprints.org is a multidisciplinary platform providing a preprint service dedicated to sharing your research from the start and empowering your research journey.

MDPI Topics is cooperating with Preprints.org and has built a direct connection between MDPI journals and Preprints.org. Authors are encouraged to take advantage of these benefits by posting a preprint at Preprints.org prior to publication:

  1. Immediately share your ideas ahead of publication and establish your research priority;
  2. Protect your idea with a time-stamped preprint article;
  3. Enhance the exposure and impact of your research;
  4. Receive feedback from your peers in advance;
  5. Have it indexed in Web of Science (Preprint Citation Index), Google Scholar, Crossref, SHARE, PrePubMed, Scilit and Europe PMC.

Published Papers (7 papers)

19 pages, 9863 KiB  
Article
Cross-Video Pedestrian Tracking Algorithm with a Coordinate Constraint
by Cheng Huang, Weihong Li, Guang Yang, Jiachen Yan, Baoding Zhou and Yujun Li
Sensors 2024, 24(3), 779; https://doi.org/10.3390/s24030779 - 25 Jan 2024
Viewed by 1123
Abstract
Pedestrian tracking in surveillance videos is crucial and challenging for precise personnel management. Due to the limited coverage of a single video, the integration of multiple surveillance videos is necessary in practical applications. In the realm of pedestrian management using multiple surveillance videos, continuous pedestrian tracking is quite important. However, prevailing cross-video pedestrian matching methods mainly rely on the appearance features of pedestrians, resulting in low matching accuracy and poor tracking robustness. To address these shortcomings, this paper presents a cross-video pedestrian tracking algorithm, which introduces spatial information. The proposed algorithm introduces the coordinate features of pedestrians in different videos and a linear weighting strategy focusing on the overlapping view of the tracking process. The experimental results show that, compared to traditional methods, the method in this paper improves the success rate of target pedestrian matching and enhances the robustness of continuous pedestrian tracking. This study provides a viable reference for pedestrian tracking and crowd management in video applications.
(This article belongs to the Topic Visual Object Tracking: Challenges and Applications)
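The coordinate-plus-appearance fusion described in the abstract can be sketched as a linear weighting of two similarity terms. The weights, the distance normalization, and the greedy matcher below are illustrative assumptions, not values or algorithms from the paper (which the abstract says focuses on the overlapping view between cameras):

```python
import numpy as np

def match_score(app_sim, pos_a, pos_b, w_app=0.6, w_pos=0.4, max_dist=10.0):
    """Fuse appearance similarity with a coordinate-distance term.

    app_sim: appearance similarity in [0, 1] (e.g. cosine similarity of
    re-ID embeddings); pos_a/pos_b: pedestrian coordinates projected into
    a common ground plane. w_app, w_pos, and max_dist are illustrative.
    """
    dist = float(np.linalg.norm(np.asarray(pos_a) - np.asarray(pos_b)))
    pos_sim = max(0.0, 1.0 - dist / max_dist)  # 1 when co-located, 0 beyond max_dist
    return w_app * app_sim + w_pos * pos_sim

def match_pedestrians(sim_matrix):
    """Greedy one-to-one assignment over a fused score matrix (sketch;
    the Hungarian algorithm would give the optimal assignment)."""
    sim = np.array(sim_matrix, dtype=float)
    pairs = []
    while sim.size and sim.max() > 0.5:  # 0.5: illustrative acceptance threshold
        i, j = np.unravel_index(sim.argmax(), sim.shape)
        pairs.append((int(i), int(j)))
        sim[i, :] = -1
        sim[:, j] = -1
    return pairs
```

The point of the coordinate term is that two pedestrians with similar clothing but far-apart projected positions can no longer be confused, which is how spatial information lifts the matching success rate.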

16 pages, 3123 KiB  
Article
DetTrack: An Algorithm for Multiple Object Tracking by Improving Occlusion Object Detection
by Xinyue Gao, Zhengyou Wang, Xiaofan Wang, Shuo Zhang, Shanna Zhuang and Hui Wang
Electronics 2024, 13(1), 91; https://doi.org/10.3390/electronics13010091 - 25 Dec 2023
Cited by 1 | Viewed by 3372
Abstract
Multi-object tracking (MOT) is an important problem in computer vision with a wide range of applications. Currently, detecting occluded objects remains a serious challenge in multi-object tracking tasks. In this paper, we propose a method that simultaneously improves occluded object detection and occluded object tracking, as well as a tracking method for when the object is completely occluded. First, motion track prediction is utilized to raise the upper limit of occluded object detection. Then, the spatio-temporal feature information between the object and the surrounding environment is used for multi-object tracking. Finally, we use the hypothesis frame to continuously track the completely occluded object. Our study shows that we achieve competitive performance compared to current state-of-the-art methods on popular multi-object tracking benchmarks such as MOT16, MOT17, and MOT20.
(This article belongs to the Topic Visual Object Tracking: Challenges and Applications)
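The idea of keeping a hypothesis box alive while the object is completely occluded can be sketched with a constant-velocity motion model. The functions below are an illustrative stand-in, not the paper's DetTrack implementation:

```python
def extrapolate_box(box, velocity, steps=1):
    """Propagate a bounding box (x, y, w, h) by a constant-velocity
    motion model for `steps` frames: while the detector returns nothing,
    keep predicting where the occluded object should be."""
    x, y, w, h = box
    vx, vy = velocity
    return (x + vx * steps, y + vy * steps, w, h)

def track_through_occlusion(last_box, velocity, detections_per_frame):
    """Return one box per frame, using detections when available and
    hypothesis boxes during total occlusion."""
    boxes = []
    box = last_box
    for det in detections_per_frame:
        if det is not None:
            box = det                             # detector re-acquired the object
        else:
            box = extrapolate_box(box, velocity)  # fully occluded: predict
        boxes.append(box)
    return boxes
```

When the object reappears, the hypothesis box is close enough to the new detection for the track identity to be preserved, which is the practical payoff of tracking through total occlusion.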

15 pages, 7604 KiB  
Article
Optical Particle Visualization Technique Using Red–Green–Blue and Core Storage Shed Flow Field Analysis
by Mok-Lyang Cho and Ji-Soo Ha
Appl. Sci. 2023, 13(19), 10997; https://doi.org/10.3390/app131910997 - 5 Oct 2023
Viewed by 1133
Abstract
This study uses a flow visualization method to analyze the flow field of a shed-type coal storage shed, comparing and verifying the findings through numerical calculation. Initially, a coal warehouse-scale model is created for flow visualization. Laser-based cross-sectional analysis yields essential flow data, from which red–green–blue values are extracted, and the flow object with the highest G value is selected. Subsequently, as the video frame changes, the moving object is tracked, and the direction is derived. The velocity vector of the moving object within the designated area is derived. Finally, we compare the results of the flow visualization experiment with the simulation outcome. Notably, the error rate in regions characterized by high flow velocity is found to be low, and a high implementation rate is observed in areas with many floating objects to track. Conversely, implementation accuracy is lower in low-velocity fields. Both methods result in a recirculation zone at the top of the inlet, and a flow stagnation region occurs on the upper part of the central wall.
(This article belongs to the Topic Visual Object Tracking: Challenges and Applications)
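The pipeline in the abstract (extract RGB values, select the object with the highest G value, track it across frames, derive a velocity vector) can be sketched per pixel as follows. This is a simplified illustration: a real implementation would segment particle blobs rather than track a single brightest pixel.

```python
import numpy as np

def brightest_green(frame_rgb):
    """Return the (row, col) of the pixel with the highest G value
    in an (H, W, 3) RGB frame."""
    g = frame_rgb[:, :, 1]
    idx = np.unravel_index(np.argmax(g), g.shape)
    return int(idx[0]), int(idx[1])

def velocity_vector(frame_a, frame_b, dt=1.0):
    """Displacement of the highest-G particle between two consecutive
    frames, divided by the frame interval dt; returns (vx, vy) in
    pixels per frame."""
    r0, c0 = brightest_green(frame_a)
    r1, c1 = brightest_green(frame_b)
    return ((c1 - c0) / dt, (r1 - r0) / dt)
```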

16 pages, 3206 KiB  
Article
SiamUT: Siamese Unsymmetrical Transformer-like Tracking
by Lingyu Yang, Hao Zhou, Guowu Yuan, Mengen Xia, Dong Chen, Zhiliang Shi and Enbang Chen
Electronics 2023, 12(14), 3133; https://doi.org/10.3390/electronics12143133 - 19 Jul 2023
Viewed by 1125
Abstract
Siamese networks have proven to be suitable for many computer vision tasks, including single object tracking. These trackers leverage the Siamese structure to benefit from feature cross-correlation, which measures the similarity between a target template and the corresponding search region. However, the linear nature of the correlation operation leads to the loss of important semantic information and may result in suboptimal performance when faced with complex background interference or significant object deformations. In this paper, we introduce the Transformer structure, which has been successful in vision tasks, to enhance the Siamese network's performance in challenging conditions. By incorporating self-attention and cross-attention mechanisms, we modify the original Transformer into an asymmetrical version that can focus on different regions of the feature map. This Transformer-like fusion network enables more efficient and effective fusion procedures. Additionally, we introduce a two-layer output structure with decoupled prediction heads, improved loss functions, and window penalty post-processing. This design enhances the performance of both the classification and the regression branches. Extensive experiments conducted on large public datasets such as LaSOT, GOT-10k, and TrackingNet demonstrate that our proposed SiamUT tracker achieves state-of-the-art precision performance on most benchmark datasets.
(This article belongs to the Topic Visual Object Tracking: Challenges and Applications)
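The feature cross-correlation that Siamese trackers build on can be sketched as a dense sliding-window dot product between template features and search-region features. The naive loop below is for clarity only; real trackers run this as a convolution over learned deep features, and it is exactly the linear operation whose limitations the paper addresses with attention:

```python
import numpy as np

def xcorr_response(search_feat, template_feat):
    """Dense cross-correlation of a template feature map over a search
    feature map; the response peak marks the most similar location.
    Shapes: search (C, Hs, Ws), template (C, Ht, Wt)."""
    _, hs, ws = search_feat.shape
    _, ht, wt = template_feat.shape
    out = np.zeros((hs - ht + 1, ws - wt + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            window = search_feat[:, r:r + ht, c:c + wt]
            out[r, c] = float((window * template_feat).sum())
    return out
```

Because every output value is a single inner product, the response map cannot represent richer template-search interactions, which is the motivation for replacing it with cross-attention fusion.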

17 pages, 5941 KiB  
Article
Motion Vector Extrapolation for Video Object Detection
by Julian True and Naimul Khan
J. Imaging 2023, 9(7), 132; https://doi.org/10.3390/jimaging9070132 - 29 Jun 2023
Cited by 1 | Viewed by 2279
Abstract
Despite the continued successes of computationally efficient deep neural network architectures for video object detection, performance continually arrives at the great trilemma of speed versus accuracy versus computational resources (pick two). Current attempts to exploit temporal information in video data to overcome this trilemma are bottlenecked by the state of the art in object detection models. This work presents motion vector extrapolation (MOVEX), a technique which performs video object detection through the use of off-the-shelf object detectors alongside existing optical flow-based motion estimation techniques in parallel. This work demonstrates that this approach significantly reduces the baseline latency of any given object detector without sacrificing accuracy performance. Further latency reductions up to 24 times lower than the original latency can be achieved with minimal accuracy loss. MOVEX enables low-latency video object detection on common CPU-based systems, thus allowing for high-performance video object detection beyond the domain of GPU computing.
(This article belongs to the Topic Visual Object Tracking: Challenges and Applications)
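The core idea, shifting stale detections by estimated motion instead of re-running the slow detector on every frame, can be sketched as follows. This is a minimal sketch assuming a dense per-pixel flow field; the actual MOVEX pipeline runs the detector and the motion estimator in parallel and is considerably more involved:

```python
import numpy as np

def extrapolate_detections(boxes, flow):
    """Shift each detection box by the mean motion vector inside it,
    approximating object positions on frames where the detector was
    not run. boxes: list of (x1, y1, x2, y2) with integer pixel
    coordinates; flow: (H, W, 2) array of per-pixel (dx, dy) motion."""
    shifted = []
    for x1, y1, x2, y2 in boxes:
        region = flow[y1:y2, x1:x2]
        dx, dy = region.reshape(-1, 2).mean(axis=0)
        shifted.append((x1 + dx, y1 + dy, x2 + dx, y2 + dy))
    return shifted
```

Averaging the flow over the box makes the update robust to noisy per-pixel vectors while keeping the per-frame cost negligible next to a full detector pass, which is where the latency reduction comes from.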

20 pages, 7143 KiB  
Article
A Target Re-Identification Method Based on Shot Boundary Object Detection for Single Object Tracking
by Bingchen Miao, Zengzhao Chen, Hai Liu and Aijun Zhang
Appl. Sci. 2023, 13(11), 6422; https://doi.org/10.3390/app13116422 - 24 May 2023
Cited by 5 | Viewed by 1923
Abstract
With the advantages of a simple model structure and a performance-speed balance, the single object tracking (SOT) model based on a Transformer has become a hot topic in the current object tracking field. However, tracking errors caused by the target leaving the shot, namely the target going out of view, occur in videos more often than we imagine. To address this issue, we propose a target re-identification method for SOT called TRTrack. First, we built a bipartite matching model of candidate tracklets and neighbor tracklets, optimized by the Hopcroft–Karp algorithm, which is used for preliminary tracking and for judging whether the target has left the shot. It achieves 76.3% mAO on the tracking benchmark Generic Object Tracking-10k (GOT-10k). Then, we introduced the alpha-IoU loss function into YOLOv5-DeepSORT to detect shot boundary objects and attained 38.62% mAP75:95 on Microsoft Common Objects in Context 2017 (MS COCO 2017). Eventually, we designed a backtracking identification module in TRTrack to re-identify the target. Experimental results confirmed the effectiveness of our method, which is superior to most state-of-the-art models.
(This article belongs to the Topic Visual Object Tracking: Challenges and Applications)
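The alpha-IoU loss mentioned in the abstract generalizes the IoU loss by raising the IoU to a power alpha (commonly 3), which up-weights the contribution of high-IoU, nearly correct boxes. Below is a minimal sketch of the basic power-IoU form; the full alpha-IoU family also adds penalty terms (e.g. alpha-CIoU), which are omitted here:

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(box_a) + area(box_b) - inter
    return inter / union if union > 0 else 0.0

def alpha_iou_loss(pred, target, alpha=3.0):
    """Basic alpha-IoU loss, 1 - IoU**alpha. With alpha = 1 this
    reduces to the standard IoU loss."""
    return 1.0 - iou(pred, target) ** alpha
```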

18 pages, 6149 KiB  
Article
Global Context Attention for Robust Visual Tracking
by Janghoon Choi
Sensors 2023, 23(5), 2695; https://doi.org/10.3390/s23052695 - 1 Mar 2023
Cited by 2 | Viewed by 1846
Abstract
Although recent Siamese-network-based visual tracking methods show high performance on numerous large-scale visual tracking benchmarks, distractor objects with appearances similar to the target object remain a persistent challenge. To address this issue, we propose a novel global context attention module for visual tracking, which can extract and summarize holistic global scene information to modulate the target embedding for improved discriminability and robustness. Our global context attention module receives a global feature correlation map to elicit the contextual information from a given scene and generates channel and spatial attention weights to modulate the target embedding to focus on the relevant feature channels and spatial parts of the target object. Our proposed tracking algorithm is tested on large-scale visual tracking datasets, where we show improved performance compared to the baseline tracking algorithm while achieving competitive performance with real-time speed. Additional ablation experiments also validate the effectiveness of the proposed module, where our tracking algorithm shows improvements in various challenging attributes of visual tracking.
(This article belongs to the Topic Visual Object Tracking: Challenges and Applications)
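The channel and spatial gating described in the abstract can be sketched with simple pooled-sigmoid attention weights. The pooling scheme below is a generic illustration of attention-based modulation, not the paper's exact global context attention module:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def modulate_embedding(target_feat, context_feat):
    """Gate a target embedding (C, H, W) with channel and spatial
    attention weights derived from a global-context feature map of
    the same shape."""
    # channel attention: one weight per channel from spatially pooled context
    chan = sigmoid(context_feat.mean(axis=(1, 2)))   # shape (C,)
    # spatial attention: one weight per location from channel-pooled context
    spat = sigmoid(context_feat.mean(axis=0))        # shape (H, W)
    return target_feat * chan[:, None, None] * spat[None, :, :]
```

The gating suppresses feature channels and spatial positions that the scene context marks as belonging to distractors, which is how the modulated embedding gains discriminability.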
