16.6.2.3 Vision Language Tracking, Vision Language Object Tracking

Target Tracking. Vision Language.
See also Siamese Networks for Tracking.
See also Region, Object, Target Tracking.
See also Vision-Language Navigation.

Zhao, H.J.[Hao-Jie], Wang, X.[Xiao], Wang, D.[Dong], Lu, H.C.[Hu-Chuan], Ruan, X.[Xiang],
Transformer vision-language tracking via proxy token guided cross-modal fusion,
PRL(168), 2023, pp. 10-16.
Elsevier DOI 2304
Visual object tracking, Transformer, Vision-language BibRef

Zhang, G.T.[Guang-Tong], Zhong, B.N.[Bi-Neng], Liang, Q.H.[Qi-Hua], Mo, Z.Y.[Zhi-Yi], Li, N.[Ning], Song, S.X.[Shu-Xiang],
One-Stream Stepwise Decreasing for Vision-Language Tracking,
CirSysVideo(34), No. 10, October 2024, pp. 9053-9063.
IEEE DOI 2411
Visualization, Feature extraction, Target tracking, Automobiles, Natural languages, Roads, Information filters, Object tracking, vision-language tracking BibRef

Zheng, Y.Z.[Yao-Zong], Zhong, B.[Bineng], Liang, Q.H.[Qi-Hua], Li, G.R.[Guo-Rong], Ji, R.R.[Rong-Rong], Li, X.X.[Xian-Xian],
Toward Unified Token Learning for Vision-Language Tracking,
CirSysVideo(34), No. 4, April 2024, pp. 2125-2135.
IEEE DOI 2404
Task analysis, Target tracking, Visualization, Feature extraction, Pipelines, Linguistics, Training, Vision-language tracking, multi-modal modeling BibRef

Zhang, H.L.[Huan-Long], Wang, J.C.[Jing-Chao], Zhang, J.W.[Jian-Wei], Zhang, T.Z.[Tian-Zhu], Zhong, B.N.[Bi-Neng],
One-Stream Vision-Language Memory Network for Object Tracking,
MultMed(26), 2024, pp. 1720-1730.
IEEE DOI 2402
Target tracking, Visualization, Linguistics, Iron, Feature extraction, Computational modeling, Adaptation models, memory network BibRef

Yu, J.[Jun], Cai, Z.P.[Zhong-Peng], Li, Y.H.[Yi-Hao], Wang, L.[Lei], Gao, F.[Fang], Yu, Y.[Ye],
Language-Guided Dual-Modal Local Correspondence for Single Object Tracking,
MultMed(26), 2024, pp. 10637-10650.
IEEE DOI 2411
Target tracking, Visualization, Object tracking, Semantics, Task analysis, Natural languages, Feature extraction, transformer BibRef

Guo, M.Z.[Ming-Zhe], Zhang, Z.P.[Zhi-Peng], Jing, L.P.[Li-Ping], Ling, H.B.[Hai-Bin], Fan, H.[Heng],
Divert More Attention to Vision-Language Object Tracking,
PAMI(46), No. 12, December 2024, pp. 8600-8618.
IEEE DOI 2411
Visualization, Mixers, Annotations, Target tracking, Videos, Task analysis, Benchmark testing, Vision-language (VL) tracking, multimodal alignment BibRef

Ye, P.[Ping], Xiao, G.[Gang], Liu, J.[Jun],
Multimodal Features Alignment for Vision-Language Object Tracking,
RS(16), No. 7, 2024, pp. 1168.
DOI Link 2404
BibRef

Liang, Y.J.[Yan-Jie], Wu, Q.Q.[Qiang-Qiang], Cheng, L.[Lin], Xia, C.Q.[Chang-Qun], Li, J.[Jia],
Progressive Semantic-Visual Alignment and Refinement for Vision-Language Tracking,
CirSysVideo(35), No. 5, May 2025, pp. 4271-4286.
IEEE DOI 2505
Transformers, Target tracking, Visualization, Semantics, Feature extraction, Object tracking, Natural languages, channel communication patch interaction BibRef

Li, N.[Ning], Zhong, B.[Bineng], Liang, Q.H.[Qi-Hua], Mo, Z.Y.[Zhi-Yi], Nong, J.[Jian], Song, S.X.[Shu-Xiang],
SIEVL-Track: Exploring Semantic Information Enhancement for Visual-Language Object Tracking,
CirSysVideo(35), No. 6, June 2025, pp. 5872-5884.
IEEE DOI 2506
Semantics, Target tracking, Visualization, Transformers, Predictive models, Linguistics, Object tracking, Encoding, Accuracy BibRef

Xue, Y.L.[Yuan-Liang], Zhong, B.[Bineng], Jin, G.D.[Guo-Dong], Shen, T.[Tao], Tan, L.[Lining], Li, N.[Ning], Zheng, Y.Z.[Yao-Zong],
AVLTrack: Dynamic Sparse Learning for Aerial Vision-Language Tracking,
CirSysVideo(35), No. 8, August 2025, pp. 7554-7567.
IEEE DOI Code: WWW Link. 2508
Target tracking, Visualization, Feature extraction, DSL, Transformers, Natural languages, Autonomous aerial vehicles, multi-level language perception BibRef

Lei, L.[Lei], Li, X.X.[Xian-Xian],
Multi-Modal Hybrid Interaction Vision-Language Tracking,
MultMed(27), 2025, pp. 5857-5865.
IEEE DOI 2509
Target tracking, Visualization, Feature extraction, Decoding, Data mining, Natural language processing, Object tracking, Head, vision-language BibRef

Zong, C.G.[Chen-Gao], Zhao, J.[Jie], Chen, X.[Xin], Lu, H.C.[Hu-Chuan], Wang, D.[Dong],
Learning Language Prompt for Vision-Language Tracking,
CirSysVideo(35), No. 9, September 2025, pp. 9287-9299.
IEEE DOI 2509
Target tracking, Visualization, Benchmark testing, Linguistics, Training, Grounding, Object tracking, Annotations, Data models, language annotation BibRef

Zhu, H.[Hong], Lu, Q.Y.[Qing-Yang], Xue, L.[Lei], Zhang, P.P.[Ping-Ping], Yuan, G.L.[Guang-Lin],
Vision-Language Tracking With CLIP and Interactive Prompt Learning,
ITS(26), No. 3, March 2025, pp. 3659-3670.
IEEE DOI 2503
Feature extraction, Target tracking, Visualization, Foundation models, Linguistics, Semantics, Computational modeling, layer-wise feature fusion BibRef


Li, X.H.[Xiao-Hai], Zhong, B.[Bineng], Liang, Q.H.[Qi-Hua], Mo, Z.Y.[Zhi-Yi], Nong, J.[Jian], Song, S.X.[Shu-Xiang],
Dynamic Updates for Language Adaptation in Visual-Language Tracking,
CVPR25(19165-19174)
IEEE DOI Code: WWW Link. 2508
Visualization, Target tracking, Codes, Large language models, Semantics, Benchmark testing, Robustness, Frequency control, multi-modal BibRef

Liu, X.[Xinqi], Zhou, L.[Li], Zhou, Z.[Zikun], Chen, J.Q.[Jian-Qiu], He, Z.Y.[Zhen-Yu],
MambaVLT: Time-Evolving Multimodal State Space Model for Vision-Language Tracking,
CVPR25(8731-8741)
IEEE DOI 2508
Adaptation models, Visualization, Target tracking, Heuristic algorithms, Face recognition, Refining, Transformers, state space model BibRef

Wei, H.K.[Hong-Kai], Yang, Y.[Yang], Sun, S.J.[Shi-Jie], Feng, M.T.[Ming-Tao], Song, X.Y.[Xiang-Yu], Lei, Q.[Qi], Hu, H.L.[Hong-Li], Wang, R.[Rong], Song, H.S.[Huan-Sheng], Akhtar, N.[Naveed], Mian, A.S.[Ajmal Saeed],
Mono3DVLT: Monocular-Video-Based 3D Visual Language Tracking,
CVPR25(13886-13896)
IEEE DOI Code: WWW Link. 2508
Visualization, Solid modeling, Target tracking, Radar measurements, Video sequences, Radar tracking, Feature extraction, 3d single object tracking BibRef

Li, X.C.[Xu-Chen], Feng, X.K.[Xiao-Kun], Hu, S.Y.[Shi-Yu], Wu, M.[Meiqi], Zhang, D.L.[Dai-Ling], Zhang, J.[Jing], Huang, K.Q.[Kai-Qi],
DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM,
VDU24(7283-7292)
IEEE DOI 2410
Visualization, Annotations, Semantics, Natural languages, Benchmark testing BibRef

Shao, Y.Y.[Yan-Yan], He, S.T.[Shu-Ting], Ye, Q.[Qi], Feng, Y.C.[Yu-Chao], Luo, W.H.[Wen-Han], Chen, J.M.[Ji-Ming],
Context-Aware Integration of Language and Visual References for Natural Language Tracking,
CVPR24(19208-19217)
IEEE DOI Code: WWW Link. 2410
Visualization, Target tracking, Grounding, Video sequences, Merging, Modulation, Linguistics, natural language tracking, visual object tracking BibRef

Alansari, M.[Mohamad], Abughali, A.[Ahmed], Habash, O.[Obadah], Alnuaimi, K.[Khaled], Javed, S.[Sajid], Werghi, N.[Naoufel],
Integrating Vision-Language Supervision for Uniform Appearance Tracking,
ICIP24(747-752)
IEEE DOI 2411
Bridges, Visualization, Target tracking, Annotations, Natural languages, Benchmark testing, Visual Object Tracking BibRef

Feng, Q.[Qi], Ablavsky, V.[Vitaly], Bai, Q.[Qinxun], Sclaroff, S.[Stan],
Siamese Natural Language Tracker: Tracking by Natural Language Descriptions with Siamese Trackers,
CVPR21(5847-5856)
IEEE DOI 2111
Visualization, Natural languages, Graphics processing units, Real-time systems BibRef

Feng, Q., Ablavsky, V., Bai, Q., Li, G., Sclaroff, S.,
Real-time Visual Object Tracking with Natural Language Description,
WACV20(689-698)
IEEE DOI 2006
Target tracking, Feature extraction, Natural languages, Convolution, Visualization, Proposals BibRef

Li, Z.Y.[Zhen-Yang], Tao, R., Gavves, E.[Efstratios], Snoek, C.G.M.[Cees G.M.], Smeulders, A.W.M.[Arnold W.M.],
Tracking by Natural Language Specification,
CVPR17(7350-7358)
IEEE DOI 1711
Image segmentation, Man-machine systems, Natural languages, Target tracking, Visualization BibRef

Chapter on Motion -- Feature-Based, Long Range, Motion and Structure Estimates, Tracking, Surveillance, Activities continues in
Tracking using Neural Nets, Learning.


Last update: Sep 27, 2025 at 16:28:57