Visual7W visual question answering,
Large-scale visual question answering (QA) dataset, with object-level
groundings and multimodal answers.
WWW Link.
Dataset, Visual Question Answering.
Liang, J.W.[Jun-Wei],
Jiang, L.[Lu],
Cao, L.L.[Liang-Liang],
Kalantidis, Y.[Yannis],
Li, L.J.[Li-Jia],
Hauptmann, A.G.[Alexander G.],
Focal Visual-Text Attention for Memex Question Answering,
PAMI(41), No. 8, August 2019, pp. 1893-1908.
IEEE DOI
1907
BibRef
Earlier: A1, A2, A3, A5, A6, Only:
Focal Visual-Text Attention for Visual Question Answering,
CVPR18(6135-6143)
IEEE DOI
1812
Task analysis, Knowledge discovery, Visualization, Grounding,
Metadata, Cognition, Photo albums, question answering,
memex.
Visualization, Videos, Computational modeling, Correlation.
BibRef
Riquelme, F.[Felipe],
de Goyeneche, A.[Alfredo],
Zhang, Y.D.[Yun-Dong],
Niebles, J.C.[Juan Carlos],
Soto, A.[Alvaro],
Explaining VQA predictions using visual grounding and a knowledge
base,
IVC(101), 2020, pp. 103968.
Elsevier DOI
2009
Deep Learning, Attention, Supervision, Knowledge Base,
Interpretability, Explainability
BibRef
Niu, Y.L.[Yu-Lei],
Zhang, H.W.[Han-Wang],
Lu, Z.W.[Zhi-Wu],
Chang, S.F.[Shih-Fu],
Variational Context: Exploiting Visual and Textual Context for
Grounding Referring Expressions,
PAMI(43), No. 1, January 2021, pp. 347-359.
IEEE DOI
2012
Grounding, Context modeling, Visualization, Task analysis,
Pediatrics, Bayes methods, Annotations,
referring expression generation
BibRef
Yang, S.[Sibei],
Li, G.B.[Guan-Bin],
Yu, Y.Z.[Yi-Zhou],
Relationship-Embedded Representation Learning for Grounding Referring
Expressions,
PAMI(43), No. 8, August 2021, pp. 2765-2779.
IEEE DOI
2107
BibRef
Earlier:
Cross-Modal Relationship Inference for Grounding Referring Expressions,
CVPR19(4140-4149).
IEEE DOI
2002
Locate the object instance in an image described by a referring expression.
Visualization, Semantics, Grounding, Proposals, Data mining,
Logic gates, Feature extraction, Referring expressions,
gated graph convolutional network.
Locate target object based on natural language descriptions.
BibRef
Yang, Z.Y.[Zheng-Yuan],
Kumar, T.[Tushar],
Chen, T.L.[Tian-Lang],
Su, J.S.[Jing-Song],
Luo, J.B.[Jie-Bo],
Grounding-Tracking-Integration,
CirSysVideo(31), No. 9, September 2021, pp. 3433-3443.
IEEE DOI
2109
Grounding, Target tracking, Visualization, History, Task analysis,
Object tracking, Annotations, Tracking by language
BibRef
Zhang, W.X.[Wei-Xia],
Ma, C.[Chao],
Wu, Q.[Qi],
Yang, X.K.[Xiao-Kang],
Language-Guided Navigation via Cross-Modal Grounding and Alternate
Adversarial Learning,
CirSysVideo(31), No. 9, September 2021, pp. 3469-3481.
IEEE DOI
2109
Navigation, Training, Trajectory, Visualization, Task analysis,
Grounding, Generators, Vision-and-language, embodied navigation,
adversarial learning
BibRef
Zhai, S.L.[Song-Lin],
Guo, G.B.[Gui-Bing],
Yuan, F.J.[Fa-Jie],
Liu, Y.[Yuan],
Wang, X.W.[Xing-Wei],
VSE-fs: Fast Full-Sample Visual Semantic Embedding,
IEEE_Int_Sys(36), No. 4, July 2021, pp. 3-12.
IEEE DOI
2109
Construct a joint embedding space between visual features and semantic
information.
Computational modeling, Training, Integrated circuits,
Time complexity, Semantics, Visualization, Intelligent systems,
Negative Sampling
BibRef
Sun, M.J.[Ming-Jie],
Xiao, J.[Jimin],
Lim, E.G.[Eng Gee],
Liu, S.[Si],
Goulermas, J.Y.[John Y.],
Discriminative Triad Matching and Reconstruction for Weakly Referring
Expression Grounding,
PAMI(43), No. 11, November 2021, pp. 4189-4195.
IEEE DOI
2110
Image reconstruction, Training, Proposals, Visualization,
Task analysis, Linguistics, Grounding, discriminative triad matching
BibRef
Bargal, S.A.[Sarah Adel],
Zunino, A.[Andrea],
Petsiuk, V.[Vitali],
Zhang, J.M.[Jian-Ming],
Saenko, K.[Kate],
Murino, V.[Vittorio],
Sclaroff, S.[Stan],
Guided Zoom: Zooming into Network Evidence to Refine Fine-Grained
Model Decisions,
PAMI(43), No. 11, November 2021, pp. 4196-4202.
IEEE DOI
2110
Grounding, Training, Predictive models, Annotations,
Location awareness, Correlation, Visualization, Explainable AI,
convolutional neural networks
BibRef
Yang, W.F.[Wen-Fei],
Zhang, T.Z.[Tian-Zhu],
Zhang, Y.D.[Yong-Dong],
Wu, F.[Feng],
Local Correspondence Network for Weakly Supervised Temporal Sentence
Grounding,
IP(30), 2021, pp. 3252-3262.
IEEE DOI
2103
Grounding, Annotations, Training,
Feature extraction, Computational modeling, Task analysis,
temporal sentence grounding
BibRef
Luo, W.[Wang],
Zhang, T.Z.[Tian-Zhu],
Yang, W.[Wenfei],
Liu, J.G.[Jin-Gen],
Mei, T.[Tao],
Wu, F.[Feng],
Zhang, Y.D.[Yong-Dong],
Action Unit Memory Network for Weakly Supervised Temporal Action
Localization,
CVPR21(9964-9974)
IEEE DOI
2111
Location awareness, Training, Knowledge engineering,
Motion segmentation, Refining, Interference, Benchmark testing
BibRef
Hong, R.[Richang],
Liu, D.[Daqing],
Mo, X.Y.[Xiao-Yu],
He, X.N.[Xiang-Nan],
Zhang, H.[Hanwang],
Learning to Compose and Reason with Language Tree Structures for
Visual Grounding,
PAMI(44), No. 2, February 2022, pp. 684-696.
IEEE DOI
2201
Grounding, Visualization, Dogs, Natural languages, Cognition,
Computational modeling, Semantics, Fine-grained detection,
visual reasoning
BibRef
Bin, Y.[Yi],
Ding, Y.J.[Yu-Juan],
Peng, B.[Bo],
Peng, L.[Liang],
Yang, Y.[Yang],
Chua, T.S.[Tat-Seng],
Entity Slot Filling for Visual Captioning,
CirSysVideo(32), No. 1, January 2022, pp. 52-62.
IEEE DOI
2201
Task analysis, Visualization, Neural networks, Adaptation models,
Filling, Grounding, Training, Image captioning,
dataset
BibRef
Chu, C.[Chenhui],
Oliveira, V.[Vinicius],
Virgo, F.G.[Felix Giovanni],
Otani, M.[Mayu],
Garcia, N.[Noa],
Nakashima, Y.[Yuta],
The semantic typology of visually grounded paraphrases,
CVIU(215), 2022, pp. 103333.
Elsevier DOI
2201
Vision and language, Image interpretation,
Visual grounded paraphrases, Semantic typology, Dataset
BibRef
Deng, C.R.[Chao-Rui],
Wu, Q.[Qi],
Wu, Q.Y.[Qing-Yao],
Hu, F.Y.[Fu-Yuan],
Lyu, F.[Fan],
Tan, M.K.[Ming-Kui],
Visual Grounding Via Accumulated Attention,
PAMI(44), No. 3, March 2022, pp. 1670-1684.
IEEE DOI
2202
BibRef
Earlier:
CVPR18(7746-7755)
IEEE DOI
1812
Task analysis, Grounding, Cognition, Visual grounding,
bounding box regression.
Visualization, Feature extraction, Grounding, Natural languages,
Redundancy, Task analysis, Computational modeling
BibRef
Plummer, B.A.[Bryan A.],
Shih, K.J.[Kevin J.],
Li, Y.C.[Yi-Chen],
Xu, K.[Ke],
Lazebnik, S.[Svetlana],
Sclaroff, S.[Stan],
Saenko, K.[Kate],
Revisiting Image-Language Networks for Open-Ended Phrase Detection,
PAMI(44), No. 4, April 2022, pp. 2155-2167.
IEEE DOI
2203
Task analysis, Grounding, Visualization, Feature extraction,
Benchmark testing, Detectors, Vocabulary, Vision and language,
representation learning
BibRef
Burns, A.[Andrea],
Tan, R.[Reuben],
Saenko, K.[Kate],
Sclaroff, S.[Stan],
Plummer, B.A.[Bryan A.],
Language Features Matter: Effective Language Representations for
Vision-Language Tasks,
ICCV19(7473-7482)
IEEE DOI
2004
Code, Visualization.
WWW Link. data visualisation, graph theory, image representation,
learning (artificial intelligence), Grounding
BibRef
Arbelle, A.[Assaf],
Doveh, S.[Sivan],
Alfassy, A.[Amit],
Shtok, J.[Joseph],
Lev, G.[Guy],
Schwartz, E.[Eli],
Kuehne, H.[Hilde],
Levi, H.B.[Hila Barak],
Sattigeri, P.[Prasanna],
Panda, R.[Rameswar],
Chen, C.F.[Chun-Fu],
Bronstein, A.M.[Alex M.],
Saenko, K.[Kate],
Ullman, S.[Shimon],
Giryes, R.[Raja],
Feris, R.[Rogerio],
Karlinsky, L.[Leonid],
Detector-Free Weakly Supervised Grounding by Separation,
ICCV21(1781-1792)
IEEE DOI
2203
Training, Location awareness, Visualization, Image segmentation,
Grounding, Genomics, Detectors, Vision + language
BibRef
Whitehead, S.[Spencer],
Wu, H.[Hui],
Ji, H.[Heng],
Feris, R.[Rogerio],
Saenko, K.[Kate],
Separating Skills and Concepts for Novel Visual Question Answering,
CVPR21(5628-5637)
IEEE DOI
2111
Training, Visualization, Grounding, Annotations,
Knowledge discovery, Encoding
BibRef
Yu, X.T.[Xin-Tong],
Zhang, H.M.[Hong-Ming],
Hong, R.X.[Rui-Xin],
Song, Y.Q.[Yang-Qiu],
Zhang, C.S.[Chang-Shui],
VD-PCR: Improving visual dialog with pronoun coreference resolution,
PR(125), 2022, pp. 108540.
Elsevier DOI
2203
Vision and language, Visual dialog, Pronoun coreference resolution
BibRef
Yuan, Y.T.[Yi-Tian],
Ma, L.[Lin],
Wang, J.W.[Jing-Wen],
Liu, W.[Wei],
Zhu, W.[Wenwu],
Semantic Conditioned Dynamic Modulation for Temporal Sentence
Grounding in Videos,
PAMI(44), No. 5, May 2022, pp. 2725-2741.
IEEE DOI
2204
Videos, Grounding, Semantics, Proposals, Task analysis, Convolution,
Visualization, Temporal sentence grounding in videos (TSG),
temporal convolution
BibRef
Lin, L.[Liang],
Yan, P.X.[Peng-Xiang],
Xu, X.Q.[Xiao-Qian],
Yang, S.[Sibei],
Zeng, K.[Kun],
Li, G.B.[Guan-Bin],
Structured Attention Network for Referring Image Segmentation,
MultMed(24), 2022, pp. 1922-1932.
IEEE DOI
2204
Visualization, Linguistics, Image segmentation, Cognition,
Feature extraction, Semantics, Task analysis,
cross-modal reasoning
BibRef
Yang, X.[Xu],
Wang, H.[Hao],
Xie, D.[De],
Deng, C.[Cheng],
Tao, D.C.[Da-Cheng],
Object-Agnostic Transformers for Video Referring Segmentation,
IP(31), 2022, pp. 2839-2849.
IEEE DOI
2204
Task analysis, Visualization, Transformers, Feature extraction,
Object detection, Image segmentation, Context modeling,
video grounding
BibRef
He, S.[Su],
Yang, X.F.[Xiao-Feng],
Lin, G.S.[Guo-Sheng],
Learning language to symbol and language to vision mapping for visual
grounding,
IVC(122), 2022, pp. 104451.
Elsevier DOI
2205
Cross modality, Visual grounding, Neural symbolic reasoning
BibRef
Jiang, W.H.[Wen-Hui],
Zhu, M.[Minwei],
Fang, Y.M.[Yu-Ming],
Shi, G.M.[Guang-Ming],
Zhao, X.W.[Xiao-Wei],
Liu, Y.[Yang],
Visual Cluster Grounding for Image Captioning,
IP(31), 2022, pp. 3920-3934.
IEEE DOI
2206
Grounding, Visualization, Proposals, Annotations, Transformers,
Task analysis, Decoding, Image captioning, attention evaluation,
grounding supervision
BibRef
Liao, Y.[Yue],
Zhang, A.[Aixi],
Chen, Z.Y.[Zhi-Yuan],
Hui, T.R.[Tian-Rui],
Liu, S.[Si],
Progressive Language-Customized Visual Feature Learning for One-Stage
Visual Grounding,
IP(31), 2022, pp. 4266-4277.
IEEE DOI
2207
Visualization, Feature extraction, Grounding, Linguistics,
Task analysis, Detectors, Representation learning,
cross-modal fusion
BibRef
Ding, X.P.[Xin-Peng],
Wang, N.N.[Nan-Nan],
Zhang, S.W.[Shi-Wei],
Huang, Z.Y.[Zi-Yuan],
Li, X.M.[Xiao-Meng],
Tang, M.Q.[Ming-Qian],
Liu, T.L.[Tong-Liang],
Gao, X.B.[Xin-Bo],
Exploring Language Hierarchy for Video Grounding,
IP(31), 2022, pp. 4693-4706.
IEEE DOI
2207
Proposals, Grounding, Training, Location awareness, Task analysis,
Semantics, Feature extraction, Video and language, language hierarchy
BibRef
Wang, Y.[Yuechen],
Deng, J.J.[Jia-Jun],
Zhou, W.G.[Wen-Gang],
Li, H.Q.[Hou-Qiang],
Weakly Supervised Temporal Adjacent Network for Language Grounding,
MultMed(24), 2022, pp. 3276-3286.
IEEE DOI
2207
Grounding, Semantics, Feature extraction, Visualization,
Task analysis, Annotations, Training, Temporal language grounding,
multiple instance learning
BibRef
Xu, Z.[Zhe],
Chen, D.[Da],
Wei, K.[Kun],
Deng, C.[Cheng],
Xue, H.[Hui],
HiSA: Hierarchically Semantic Associating for Video Temporal
Grounding,
IP(31), 2022, pp. 5178-5188.
IEEE DOI
2208
Grounding, Feature extraction, Proposals, Task analysis, Semantics,
Representation learning, Image segmentation,
cross-guided contrast
BibRef
Wang, X.[Xing],
Xie, D.[De],
Zheng, Y.[Yuanshi],
Referring expression grounding by multi-context reasoning,
PRL(160), 2022, pp. 66-72.
Elsevier DOI
2208
Referring expression grounding, Reasoning, Graph networks
BibRef
Gao, J.L.[Jia-Lin],
Sun, X.[Xin],
Ghanem, B.[Bernard],
Zhou, X.[Xi],
Ge, S.M.[Shi-Ming],
Efficient Video Grounding With Which-Where Reading Comprehension,
CirSysVideo(32), No. 10, October 2022, pp. 6900-6913.
IEEE DOI
2210
Grounding, Proposals, Visualization, Location awareness,
Task analysis, Reinforcement learning, Germanium, deep learning
BibRef
Zhou, H.[Hao],
Zhang, C.Y.[Chong-Yang],
Luo, Y.[Yan],
Hu, C.P.[Chuan-Ping],
Zhang, W.J.[Wen-Jun],
Thinking Inside Uncertainty: Interest Moment Perception for Diverse
Temporal Grounding,
CirSysVideo(32), No. 10, October 2022, pp. 7190-7203.
IEEE DOI
2210
Annotations, Grounding, Task analysis, Uncertainty, Measurement,
Predictive models, Optimization, Temporal grounding, label uncertainty
BibRef
Shen, H.T.[Heng Tao],
Chen, C.[Cheng],
Wang, P.[Peng],
Gao, L.L.[Lian-Li],
Wang, M.[Meng],
Song, J.K.[Jing-Kuan],
Continual Referring Expression Comprehension via Dual Modular
Memorization,
IP(31), 2022, pp. 6694-6706.
IEEE DOI
2211
Task analysis, Training, Benchmark testing, Training data, Grounding,
Data models, Visualization, Continual learning, lifelong learning,
visual grounding
BibRef
Tang, Z.H.[Zong-Heng],
Liao, Y.[Yue],
Liu, S.[Si],
Li, G.B.[Guan-Bin],
Jin, X.J.[Xiao-Jie],
Jiang, H.X.[Hong-Xu],
Yu, Q.[Qian],
Xu, D.[Dong],
Human-Centric Spatio-Temporal Video Grounding With Visual
Transformers,
CirSysVideo(32), No. 12, December 2022, pp. 8238-8249.
IEEE DOI
2212
Grounding, Visualization, Electron tubes, Location awareness,
Power transformers, Spatial temporal resolution, dataset
BibRef
Tang, H.Y.[Hao-Yu],
Zhu, J.[Jihua],
Wang, L.[Lin],
Zheng, Q.H.[Qing-Hai],
Zhang, T.W.[Tian-Wei],
Multi-Level Query Interaction for Temporal Language Grounding,
ITS(23), No. 12, December 2022, pp. 25479-25488.
IEEE DOI
2212
Semantics, Task analysis, Grounding, Proposals, Syntactics,
Location awareness, Feature extraction, Human-machine interface,
multi-level interaction
BibRef
Suo, W.[Wei],
Sun, M.Y.[Meng-Yang],
Wang, P.[Peng],
Zhang, Y.N.[Yan-Ning],
Wu, Q.[Qi],
Rethinking and Improving Feature Pyramids for One-Stage Referring
Expression Comprehension,
IP(32), 2023, pp. 854-864.
IEEE DOI
2301
Task analysis, Visualization, Head, Semantics, Object detection, Neck,
Computational modeling, Referring expression comprehension,
feature pyramids network
BibRef
Liu, X.J.[Xue-Jing],
Li, L.[Liang],
Wang, S.H.[Shu-Hui],
Zha, Z.J.[Zheng-Jun],
Li, Z.C.[Ze-Chao],
Tian, Q.[Qi],
Huang, Q.M.[Qing-Ming],
Entity-Enhanced Adaptive Reconstruction Network for Weakly Supervised
Referring Expression Grounding,
PAMI(45), No. 3, March 2023, pp. 3003-3018.
IEEE DOI
2302
Proposals, Image reconstruction, Grounding, Visualization,
Collaboration, Context modeling, Training, Entity enhancement,
referring expression grounding
BibRef
Liu, X.J.[Xue-Jing],
Li, L.[Liang],
Wang, S.H.[Shu-Hui],
Zha, Z.J.[Zheng-Jun],
Meng, D.C.[De-Chao],
Huang, Q.M.[Qing-Ming],
Adaptive Reconstruction Network for Weakly Supervised Referring
Expression Grounding,
ICCV19(2611-2620)
IEEE DOI
2004
Localize the object in the image from a query.
feature extraction, image classification, image reconstruction,
image retrieval, Adaptive systems
BibRef
Wang, W.[Wei],
Gao, J.Y.[Jun-Yu],
Xu, C.S.[Chang-Sheng],
Weakly-Supervised Video Object Grounding via Causal Intervention,
PAMI(45), No. 3, March 2023, pp. 3933-3948.
IEEE DOI
2302
Grounding, Visualization, Task analysis, Dairy products, Annotations,
Context modeling, Proposals, Weakly-supervised learning,
adversarial contrastive learning
BibRef
Feng, G.[Guang],
Zhang, L.[Lihe],
Sun, J.[Jiayu],
Hu, Z.W.[Zhi-Wei],
Lu, H.C.[Hu-Chuan],
Referring Segmentation via Encoder-Fused Cross-Modal Attention
Network,
PAMI(45), No. 6, June 2023, pp. 7654-7667.
IEEE DOI
2305
BibRef
Earlier: A1, A4, A2, A5, Only:
Encoder Fusion Network with Co-Attention Embedding for Referring
Image Segmentation,
CVPR21(15501-15510)
IEEE DOI
2111
Visualization, Image segmentation, Decoding, Feature extraction,
Linguistics, Task analysis, Correlation, Referring segmentation,
asymmetric cross-frame attention module.
Measurement, Visualization, Grounding, Semantics,
Transforms, Information representation
BibRef
Liu, D.[Daizong],
Zhou, P.[Pan],
Xu, Z.[Zichuan],
Wang, H.Z.[Hao-Zhao],
Li, R.[Ruixuan],
Few-Shot Temporal Sentence Grounding via Memory-Guided Semantic
Learning,
CirSysVideo(33), No. 5, May 2023, pp. 2491-2505.
IEEE DOI
2305
Semantics, Grounding, Task analysis, Training, Visualization,
Proposals, Logic gates, Temporal sentence grounding,
memory-augmented network
BibRef
Nayyeri, M.[Mojtaba],
Xu, C.J.[Cheng-Jin],
Alam, M.M.[Mirza Mohtashim],
Lehmann, J.[Jens],
Yazdi, H.S.[Hamed Shariat],
LogicENN: A Neural Based Knowledge Graphs Embedding Model With
Logical Rules,
PAMI(45), No. 6, June 2023, pp. 7050-7062.
IEEE DOI
2305
Encoding, Grounding, Computational modeling, Analytical models,
Task analysis, Optimization, Predictive models, Knowledge graph,
representation learning
BibRef
Ho, C.H.[Chih-Hui],
Appalaraju, S.[Srikar],
Jasani, B.[Bhavan],
Manmatha, R.,
Vasconcelos, N.M.[Nuno M.],
YORO - Lightweight End to End Visual Grounding,
CMMP22(3-23).
Springer DOI
2304
BibRef
Kim, D.[Dahye],
Park, J.[Jungin],
Lee, J.Y.[Ji-Young],
Park, S.[Seongheon],
Sohn, K.H.[Kwang-Hoon],
Language-free Training for Zero-shot Video Grounding,
WACV23(2538-2547)
IEEE DOI
2302
Training, Visualization, Grounding, Annotations, Natural languages, Standards
BibRef
Le, T.M.[Thao Minh],
Le, V.[Vuong],
Gupta, S.I.[Sun-Il],
Venkatesh, S.[Svetha],
Tran, T.[Truyen],
Guiding Visual Question Answering with Attention Priors,
WACV23(4370-4379)
IEEE DOI
2302
Training, Visualization, Systematics, Grounding, Semantics,
Linguistics, Cognition, visual reasoning
BibRef
Chou, S.H.[Shih-Han],
Fan, Z.C.[Zi-Cong],
Little, J.J.[James J.],
Sigal, L.[Leonid],
Semi-Supervised Grounding Alignment for Multi-Modal Feature Learning,
CRV22(48-57)
IEEE DOI
2301
Representation learning, Training, Visualization, Grounding,
Annotations, Benchmark testing, grounding, VCR
BibRef
Gupta, K.[Kshitij],
Gautam, D.[Devansh],
Mamidi, R.[Radhika],
cViL: Cross-Lingual Training of Vision-Language Models using
Knowledge Distillation,
ICPR22(1734-1741)
IEEE DOI
2212
Training, Visualization, Analytical models, Pipelines, Transformers,
Question answering (information retrieval), Data models
BibRef
Chen, D.Z.Y.[Dave Zhen-Yu],
Wu, Q.R.[Qi-Rui],
Nießner, M.[Matthias],
Chang, A.X.[Angel X.],
D3Net: A Unified Speaker-Listener Architecture for
3D Dense Captioning and Visual Grounding,
ECCV22(XXXII:487-505).
Springer DOI
2211
BibRef
Parcalabescu, L.,
Frank, A.,
Exploring Phrase Grounding without Training: Contextualisation and
Extension to Text-Based Image Retrieval,
MULWS20(4137-4146)
IEEE DOI
2008
Grounding, Visualization, Detectors, Task analysis, Linguistics,
Proposals, Training
BibRef
Tung, H.,
Harley, A.W.,
Huang, L.,
Fragkiadaki, K.,
Reward Learning from Narrated Demonstrations,
CVPR18(7004-7013)
IEEE DOI
1812
Visualization, Natural languages, Detectors, Grounding,
Speech recognition, Microphones
BibRef
Cohen, N.[Niv],
Gal, R.[Rinon],
Meirom, E.A.[Eli A.],
Chechik, G.[Gal],
Atzmon, Y.[Yuval],
'This Is My Unicorn, Fluffy':
Personalizing Frozen Vision-Language Representations,
ECCV22(XX:558-577).
Springer DOI
2211
BibRef
Lee, J.H.[Ju-Hee],
Kang, J.W.[Je-Won],
Relation Enhanced Vision Language Pre-Training,
ICIP22(2286-2290)
IEEE DOI
2211
Visualization, Semantics, Force, Transformers, Task analysis,
vision-language pre-training
BibRef
Khan, Z.[Zaid],
Kumar, B.G.V.[B. G. Vijay],
Yu, X.[Xiang],
Schulter, S.[Samuel],
Chandraker, M.[Manmohan],
Fu, Y.[Yun],
Single-Stream Multi-level Alignment for Vision-Language Pretraining,
ECCV22(XXXVI:735-751).
Springer DOI
2211
BibRef
Wang, R.[Renhao],
Zhao, H.[Hang],
Gao, Y.[Yang],
CYBORGS: Contrastively Bootstrapping Object Representations by
Grounding in Segmentation,
ECCV22(XXXI:260-277).
Springer DOI
2211
BibRef
Yang, Z.Y.[Zheng-Yuan],
Gan, Z.[Zhe],
Wang, J.F.[Jian-Feng],
Hu, X.W.[Xiao-Wei],
Ahmed, F.[Faisal],
Liu, Z.C.[Zi-Cheng],
Lu, Y.[Yumao],
Wang, L.J.[Li-Juan],
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language
Modeling,
ECCV22(XXXVI:521-539).
Springer DOI
2211
BibRef
Li, H.[Huan],
Wei, P.[Ping],
Li, J.P.[Jia-Peng],
Ma, Z.[Zeyu],
Shang, J.[Jiahui],
Zheng, N.N.[Nan-Ning],
Asymmetric Relation Consistency Reasoning for Video Relation Grounding,
ECCV22(XXXV:125-141).
Springer DOI
2211
BibRef
Dvornik, N.[Nikita],
Hadji, I.[Isma],
Pham, H.[Hai],
Bhatt, D.[Dhaivat],
Martinez, B.[Brais],
Fazly, A.[Afsaneh],
Jepson, A.D.[Allan D.],
Flow Graph to Video Grounding for Weakly-Supervised Multi-step
Localization,
ECCV22(XXXV:319-335).
Springer DOI
2211
BibRef
Qu, M.X.[Meng-Xue],
Wu, Y.[Yu],
Liu, W.[Wu],
Gong, Q.Q.[Qi-Qi],
Liang, X.D.[Xiao-Dan],
Russakovsky, O.[Olga],
Zhao, Y.[Yao],
Wei, Y.C.[Yun-Chao],
SiRi: A Simple Selective Retraining Mechanism for Transformer-Based
Visual Grounding,
ECCV22(XXXV:546-562).
Springer DOI
2211
BibRef
Zhu, C.Y.[Chao-Yang],
Zhou, Y.[Yiyi],
Shen, Y.[Yunhang],
Luo, G.[Gen],
Pan, X.[Xingjia],
Lin, M.B.[Ming-Bao],
Chen, C.[Chao],
Cao, L.J.[Liu-Juan],
Sun, X.S.[Xiao-Shuai],
Ji, R.R.[Rong-Rong],
SeqTR: A Simple Yet Universal Network for Visual Grounding,
ECCV22(XXXV:598-615).
Springer DOI
2211
BibRef
Khan, A.U.[Aisha Urooj],
Kuehne, H.[Hilde],
Gan, C.[Chuang],
da Vitoria Lobo, N.[Niels],
Shah, M.[Mubarak],
Weakly Supervised Grounding for VQA in Vision-Language Transformers,
ECCV22(XXXV:652-670).
Springer DOI
2211
BibRef
Hao, J.[Jiachang],
Sun, H.F.[Hai-Feng],
Ren, P.F.[Peng-Fei],
Wang, J.Y.[Jing-Yu],
Qi, Q.[Qi],
Liao, J.X.[Jian-Xin],
Can Shuffling Video Benefit Temporal Bias Problem: A Novel Training
Framework for Temporal Grounding,
ECCV22(XXXVI:130-147).
Springer DOI
2211
BibRef
Jain, A.[Ayush],
Gkanatsios, N.[Nikolaos],
Mediratta, I.[Ishita],
Fragkiadaki, K.[Katerina],
Bottom Up Top Down Detection Transformers for Language Grounding in
Images and Point Clouds,
ECCV22(XXXVI:417-433).
Springer DOI
2211
BibRef
Heisler, M.[Morgan],
Banitalebi-Dehkordi, A.[Amin],
Zhang, Y.[Yong],
SemAug: Semantically Meaningful Image Augmentations for Object
Detection Through Language Grounding,
ECCV22(XXXVI:610-626).
Springer DOI
2211
BibRef
Min, S.[Seonwoo],
Park, N.[Nokyung],
Kim, S.[Siwon],
Park, S.H.[Seung-Hyun],
Kim, J.[Jinkyu],
Grounding Visual Representations with Texts for Domain Generalization,
ECCV22(XXXVII:37-53).
Springer DOI
2211
BibRef
Wang, J.[Jia],
Wu, H.Y.[Hung-Yi],
Chen, J.C.[Jun-Cheng],
Shuai, H.H.[Hong-Han],
Cheng, W.H.[Wen-Huang],
Residual Graph Attention Network and Expression-Respect Data
Augmentation Aided Visual Grounding,
ICIP22(326-330)
IEEE DOI
2211
Visualization, Grounding, Training data, Cognition, Data models,
Complexity theory, Residual graph attention network, Visual grounding
BibRef
Xiong, Z.[Zeyu],
Liu, D.[Daizong],
Zhou, P.[Pan],
Gaussian Kernel-Based Cross Modal Network for Spatio-Temporal Video
Grounding,
ICIP22(2481-2485)
IEEE DOI
2211
Heating systems, Grounding, Natural languages, Electron tubes,
Task analysis, anchor-free, Gaussian kernel, spatial-temporal video grounding
BibRef
Alaniz, S.[Stephan],
Federici, M.[Marco],
Akata, Z.[Zeynep],
Compositional Mixture Representations for Vision and Text,
L3D-IVU22(4201-4210)
IEEE DOI
2210
Representation learning, Visualization, Computational modeling,
Semantics, Image retrieval, Employment, Object detection
BibRef
Cho, J.[Junhyeong],
Yoon, Y.[Youngseok],
Kwak, S.[Suha],
Collaborative Transformers for Grounded Situation Recognition,
CVPR22(19627-19636)
IEEE DOI
2210
Measurement, Training, Visualization, Computational modeling,
Estimation, Collaboration, Predictive models,
Visual reasoning
BibRef
Singh, A.[Amanpreet],
Hu, R.[Ronghang],
Goswami, V.[Vedanuj],
Couairon, G.[Guillaume],
Galuba, W.[Wojciech],
Rohrbach, M.[Marcus],
Kiela, D.[Douwe],
FLAVA: A Foundational Language And Vision Alignment Model,
CVPR22(15617-15629)
IEEE DOI
2210
Analytical models, Computational modeling, Pattern recognition,
Task analysis, Vision+language
BibRef
Saini, N.[Nirat],
Pham, K.[Khoi],
Shrivastava, A.[Abhinav],
Disentangling Visual Embeddings for Attributes and Objects,
CVPR22(13648-13657)
IEEE DOI
2210
WWW Link. Visualization, Codes, Benchmark testing,
Linguistics, Feature extraction, Recognition: detection, Visual reasoning
BibRef
Ge, Y.Y.[Yu-Ying],
Ge, Y.X.[Yi-Xiao],
Liu, X.H.[Xi-Hui],
Wang, J.P.[Jin-Peng],
Wu, J.P.[Jian-Ping],
Shan, Y.[Ying],
Qie, X.[Xiaohu],
Luo, P.[Ping],
MILES: Visual BERT Pre-training with Injected Language Semantics for
Video-Text Retrieval,
ECCV22(XXXV:691-708).
Springer DOI
2211
BibRef
Wang, A.J.P.[Alex Jin-Peng],
Ge, Y.X.[Yi-Xiao],
Cai, G.[Guanyu],
Yan, R.[Rui],
Lin, X.D.[Xu-Dong],
Shan, Y.[Ying],
Qie, X.[Xiaohu],
Shou, M.Z.[Mike Zheng],
Object-aware Video-language Pre-training for Retrieval,
CVPR22(3303-3312)
IEEE DOI
2210
Training, Visualization, Machine vision, Semantics,
Detectors, Transformers, retrieval
BibRef
Li, D.X.[Dong-Xu],
Li, J.N.[Jun-Nan],
Li, H.D.[Hong-Dong],
Niebles, J.C.[Juan Carlos],
Hoi, S.C.H.[Steven C.H.],
Align and Prompt: Video-and-Language Pre-training with Entity Prompts,
CVPR22(4943-4953)
IEEE DOI
2210
Representation learning, Vocabulary, Visualization, Semantics,
Detectors, Transformers, Pattern recognition, Vision + language,
Video analysis and understanding
BibRef
Xue, H.W.[Hong-Wei],
Hang, T.[Tiankai],
Zeng, Y.H.[Yan-Hong],
Sun, Y.C.[Yu-Chong],
Liu, B.[Bei],
Yang, H.[Huan],
Fu, J.L.[Jian-Long],
Guo, B.N.[Bai-Ning],
Advancing High-Resolution Video-Language Representation with
Large-Scale Video Transcriptions,
CVPR22(5026-5035)
IEEE DOI
2210
Visualization, Video on demand, Computational modeling,
Superresolution, Semantics, Transformers, Feature extraction,
Self-, semi-, meta-, unsupervised learning
BibRef
Sammani, F.[Fawaz],
Mukherjee, T.[Tanmoy],
Deligiannis, N.[Nikos],
NLX-GPT: A Model for Natural Language Explanations in Vision and
Vision-Language Tasks,
CVPR22(8312-8322)
IEEE DOI
2210
Current measurement, Computational modeling, Natural languages,
Decision making, Memory management, Predictive models,
Vision + language
BibRef
Lin, B.Q.[Bing-Qian],
Zhu, Y.[Yi],
Chen, Z.C.[Zi-Cong],
Liang, X.[Xiwen],
Liu, J.Z.[Jian-Zhuang],
Liang, X.D.[Xiao-Dan],
ADAPT: Vision-Language Navigation with Modality-Aligned Action
Prompts,
CVPR22(15375-15385)
IEEE DOI
2210
Visualization, Adaptation models, Navigation, Transformers,
Nonhomogeneous media, Pattern recognition, Vision + language
BibRef
Dou, Z.Y.[Zi-Yi],
Xu, Y.C.[Yi-Chong],
Gan, Z.[Zhe],
Wang, J.F.[Jian-Feng],
Wang, S.H.[Shuo-Hang],
Wang, L.J.[Li-Juan],
Zhu, C.G.[Chen-Guang],
Zhang, P.C.[Peng-Chuan],
Yuan, L.[Lu],
Peng, N.[Nanyun],
Liu, Z.C.[Zi-Cheng],
Zeng, M.[Michael],
An Empirical Study of Training End-to-End Vision-and-Language
Transformers,
CVPR22(18145-18155)
IEEE DOI
2210
Meters, Training, Codes, Computational modeling, Transformers,
Pattern recognition, Vision + language, Machine learning
BibRef
Xu, Z.P.[Zi-Peng],
Lin, T.W.[Tian-Wei],
Tang, H.[Hao],
Li, F.[Fu],
He, D.L.[Dong-Liang],
Sebe, N.[Nicu],
Timofte, R.[Radu],
Van Gool, L.J.[Luc J.],
Ding, E.[Errui],
Predict, Prevent, and Evaluate: Disentangled Text-Driven Image
Manipulation Empowered by Pre-Trained Vision-Language Model,
CVPR22(18208-18217)
IEEE DOI
2210
Personal protective equipment, Measurement, Training, Annotations,
Face recognition, Computational modeling, Face and gestures
BibRef
Du, Y.[Yu],
Wei, F.Y.[Fang-Yun],
Zhang, Z.[Zihe],
Shi, M.J.[Miao-Jing],
Gao, Y.[Yue],
Li, G.Q.[Guo-Qi],
Learning to Prompt for Open-Vocabulary Object Detection with
Vision-Language Model,
CVPR22(14064-14073)
IEEE DOI
2210
Training, Representation learning, Visualization,
Transfer learning, Object detection, Detectors,
Self-, semi-, meta-, unsupervised learning
BibRef
Chang, Y.S.[Ying-Shan],
Cao, G.H.[Gui-Hong],
Narang, M.[Mridu],
Gao, J.F.[Jian-Feng],
Suzuki, H.[Hisami],
Bisk, Y.[Yonatan],
WebQA: Multihop and Multimodal QA,
CVPR22(16474-16483)
IEEE DOI
2210
Knowledge engineering, Representation learning, Visualization,
Transformers, Cognition,
Visual reasoning
BibRef
Zellers, R.[Rowan],
Lu, J.[Jiasen],
Lu, X.[Ximing],
Yu, Y.[Youngjae],
Zhao, Y.P.[Yan-Peng],
Salehi, M.[Mohammadreza],
Kusupati, A.[Aditya],
Hessel, J.[Jack],
Farhadi, A.[Ali],
Choi, Y.[Yejin],
MERLOT RESERVE:
Neural Script Knowledge through Vision and Language and Sound,
CVPR22(16354-16366)
IEEE DOI
2210
Training, Representation learning, Visualization, Ethics,
Video on demand, Navigation, Stars, Vision + language, Visual reasoning
BibRef
Gupta, T.[Tanmay],
Kamath, A.[Amita],
Kembhavi, A.[Aniruddha],
Hoiem, D.[Derek],
Towards General Purpose Vision Systems:
An End-to-End Task-Agnostic Vision-Language Architecture,
CVPR22(16378-16388)
IEEE DOI
2210
Training, Visualization, Machine vision,
Object detection, Network architecture, Vision + language
BibRef
Materzynska, J.[Joanna],
Torralba, A.[Antonio],
Bau, D.[David],
Disentangling visual and written concepts in CLIP,
CVPR22(16389-16398)
IEEE DOI
2210
Visualization, Image coding, Benchmark testing, Cognition,
Pattern recognition, Task analysis, Vision + language, Visual reasoning
BibRef
Li, M.[Manling],
Xu, R.[Ruochen],
Wang, S.[Shuohang],
Zhou, L.[Luowei],
Lin, X.D.[Xu-Dong],
Zhu, C.G.[Chen-Guang],
Zeng, M.[Michael],
Ji, H.[Heng],
Chang, S.F.[Shih-Fu],
CLIP-Event: Connecting Text and Images with Event Structures,
CVPR22(16399-16408)
IEEE DOI
2210
Codes, Computational modeling, Image retrieval, Benchmark testing,
Information retrieval, Pattern recognition, Vision + language
BibRef
Surís, D.[Dídac],
Epstein, D.[Dave],
Vondrick, C.[Carl],
Globetrotter: Connecting Languages by Connecting Images,
CVPR22(16453-16463)
IEEE DOI
2210
Training, Deep learning, Visualization, Image segmentation, Codes,
Computational modeling, Vision + language
BibRef
Zhu, H.D.[Hai-Dong],
Sadhu, A.[Arka],
Zheng, Z.H.[Zhao-Heng],
Nevatia, R.[Ram],
Utilizing Every Image Object for Semi-supervised Phrase Grounding,
WACV21(2209-2218)
IEEE DOI
2106
Localize an object in the image given a referring expression.
Training, Grounding, Annotations,
Detectors, Task analysis
BibRef
Zhong, Y.[Yiwu],
Yang, J.W.[Jian-Wei],
Zhang, P.[Pengchuan],
Li, C.Y.[Chun-Yuan],
Codella, N.[Noel],
Li, L.H.[Liunian Harold],
Zhou, L.[Luowei],
Dai, X.[Xiyang],
Yuan, L.[Lu],
Li, Y.[Yin],
Gao, J.F.[Jian-Feng],
RegionCLIP: Region-based Language-Image Pretraining,
CVPR22(16772-16782)
IEEE DOI
2210
Representation learning, Visualization, Technological innovation,
Image recognition, Text recognition, Transfer learning, Vision + language
BibRef
Sung, Y.L.[Yi-Lin],
Cho, J.[Jaemin],
Bansal, M.[Mohit],
VL-ADAPTER: Parameter-Efficient Transfer Learning for
Vision-and-Language Tasks,
CVPR22(5217-5227)
IEEE DOI
2210
Training, Adaptation models, Computational modeling,
Transfer learning, Benchmark testing, Multitasking, Vision + language
BibRef
Wu, D.M.[Dong-Ming],
Dong, X.P.[Xing-Ping],
Shao, L.[Ling],
Shen, J.B.[Jian-Bing],
Multi-Level Representation Learning with Semantic Alignment for
Referring Video Object Segmentation,
CVPR22(4986-4995)
IEEE DOI
2210
Representation learning, Visualization, Adaptation models, Shape,
Grounding, Semantics, Vision + language, Segmentation,
grouping and shape analysis
BibRef
Gao, K.[Kaifeng],
Chen, L.[Long],
Niu, Y.[Yulei],
Shao, J.[Jian],
Xiao, J.[Jun],
Classification-Then-Grounding: Reformulating Video Scene Graphs as
Temporal Bipartite Graphs,
CVPR22(19475-19484)
IEEE DOI
2210
Image analysis, Codes, Grounding, Semantics, Bipartite graph,
Pattern recognition, Scene analysis and understanding,
Vision + language
BibRef
Kesen, I.[Ilker],
Can, O.A.[Ozan Arkan],
Erdem, E.[Erkut],
Erdem, A.[Aykut],
Yüret, D.[Deniz],
Modulating Bottom-Up and Top-Down Visual Processing via
Language-Conditional Filters,
MULA22(4609-4619)
IEEE DOI
2210
Visualization, Image segmentation, Image color analysis, Grounding,
Computational modeling, Process control, Predictive models
BibRef
Nebbia, G.[Giacomo],
Kovashka, A.[Adriana],
Doubling down: sparse grounding with an additional, almost-matching
caption for detection-oriented multimodal pretraining,
MULA22(4641-4650)
IEEE DOI
2210
Deep learning, Visualization, Grounding,
Computational modeling, Data models
BibRef
Ye, J.[Jiabo],
Tian, J.F.[Jun-Feng],
Yan, M.[Ming],
Yang, X.S.[Xiao-Shan],
Wang, X.[Xuwu],
Zhang, J.[Ji],
He, L.[Liang],
Lin, X.[Xin],
Shifting More Attention to Visual Backbone: Query-modulated
Refinement Networks for End-to-End Visual Grounding,
CVPR22(15481-15491)
IEEE DOI
2210
Training, Visualization, Grounding, Refining, Natural languages,
Feature extraction, Vision + language, Visual reasoning
BibRef
Jiang, H.J.[Hao-Jun],
Lin, Y.Z.[Yuan-Ze],
Han, D.C.[Dong-Chen],
Song, S.[Shiji],
Huang, G.[Gao],
Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding,
CVPR22(15492-15502)
IEEE DOI
2210
Training, Visualization, Costs, Grounding, Annotations,
Computational modeling, Natural languages, Vision + language, Visual reasoning
BibRef
Huang, S.[Shijia],
Chen, Y.L.[Yi-Lun],
Jia, J.Y.[Jia-Ya],
Wang, L.W.[Li-Wei],
Multi-View Transformer for 3D Visual Grounding,
CVPR22(15503-15512)
IEEE DOI
2210
Point cloud compression, Visualization, Solid modeling, Grounding,
Natural languages, Vision + language
BibRef
Chen, S.[Sijia],
Li, B.[Baochun],
Multi-Modal Dynamic Graph Transformer for Visual Grounding,
CVPR22(15513-15522)
IEEE DOI
2210
Visualization, Image analysis, Grounding, Computational modeling,
Semantics, Natural languages, Vision + language,
Scene analysis and understanding
BibRef
Mavroudi, E.[Effrosyni],
Vidal, R.[René],
Weakly-Supervised Generation and Grounding of Visual Descriptions
with Conditional Generative Models,
CVPR22(15523-15533)
IEEE DOI
2210
Visualization, Grounding, Video description,
Computational modeling, Random variables, Pattern recognition,
Video analysis and understanding
BibRef
Chen, S.[Shi],
Zhao, Q.[Qi],
REX: Reasoning-aware and Grounded Explanation,
CVPR22(15565-15574)
IEEE DOI
2210
Visualization, Codes, Grounding, Transfer learning, Decision making,
Multitasking, Vision + language, Visual reasoning
BibRef
Lou, C.[Chao],
Han, W.J.[Wen-Juan],
Lin, Y.[Yuhuan],
Zheng, Z.L.[Zi-Long],
Unsupervised Vision-Language Parsing: Seamlessly Bridging Visual
Scene Graphs with Language Structures via Dependency Relationships,
CVPR22(15586-15595)
IEEE DOI
2210
Visualization, Grounding, Buildings, Benchmark testing, Linguistics,
Pattern recognition, Vision + language, Explainable computer vision
BibRef
Yang, A.[Antoine],
Miech, A.[Antoine],
Sivic, J.[Josef],
Laptev, I.[Ivan],
Schmid, C.[Cordelia],
TubeDETR: Spatio-Temporal Video Grounding with Transformers,
CVPR22(16421-16432)
IEEE DOI
2210
Location awareness, Grounding, Natural languages,
Object detection, Benchmark testing,
Vision + language
BibRef
Luo, J.Y.[Jun-Yu],
Fu, J.[Jiahui],
Kong, X.[Xianghao],
Gao, C.[Chen],
Ren, H.B.[Hai-Bing],
Shen, H.[Hao],
Xia, H.X.[Hua-Xia],
Liu, S.[Si],
3D-SPS: Single-Stage 3D Visual Grounding via Referred Point
Progressive Selection,
CVPR22(16433-16442)
IEEE DOI
2210
Point cloud compression, Visualization, Solid modeling, Grounding,
Detectors, Vision + language
BibRef
Cai, D.[Daigang],
Zhao, L.C.[Li-Chen],
Zhang, J.[Jing],
Sheng, L.[Lu],
Xu, D.[Dong],
3DJCG: A Unified Framework for Joint Dense Captioning and Visual
Grounding on 3D Point Clouds,
CVPR22(16443-16452)
IEEE DOI
2210
Training, Point cloud compression, Visualization, Grounding,
Performance gain, Vision + language,
Recognition: detection, categorization, retrieval
BibRef
Luo, H.C.[Hong-Chen],
Zhai, W.[Wei],
Zhang, J.[Jing],
Cao, Y.[Yang],
Tao, D.C.[Da-Cheng],
Learning Affordance Grounding from Exocentric Images,
CVPR22(2242-2251)
IEEE DOI
2210
Analytical models, Visualization, Grounding, Affordances,
Computational modeling, Transforms, Feature extraction,
Scene analysis and understanding
BibRef
Jiang, X.[Xun],
Xu, X.[Xing],
Zhang, J.[Jingran],
Shen, F.M.[Fu-Min],
Cao, Z.[Zuo],
Shen, H.T.[Heng Tao],
Semi-supervised Video Paragraph Grounding with Contrastive Encoder,
CVPR22(2456-2465)
IEEE DOI
2210
Training, Grounding, Annotations, Training data,
Semisupervised learning, Transformers, Data models,
Vision + language
BibRef
Li, J.C.[Jun-Cheng],
Xie, J.L.[Jun-Lin],
Qian, L.[Long],
Zhu, L.C.[Lin-Chao],
Tang, S.L.[Si-Liang],
Wu, F.[Fei],
Yang, Y.[Yi],
Zhuang, Y.T.[Yue-Ting],
Wang, X.E.[Xin Eric],
Compositional Temporal Grounding with Structured Variational
Cross-Graph Correspondence Learning,
CVPR22(3022-3031)
IEEE DOI
2210
Grounding, Current measurement, Computational modeling, Semantics,
Diversity reception, Linguistics,
Vision + language
BibRef
Yu, W.[Wei],
Chen, W.X.[Wen-Xin],
Yin, S.[Songheng],
Easterbrook, S.[Steve],
Garg, A.[Animesh],
Modular Action Concept Grounding in Semantic Video Prediction,
CVPR22(3595-3604)
IEEE DOI
2210
Adaptation models, Visualization, Inverse problems, Grounding,
Semantics, Object detection, Predictive models,
Vision + language
BibRef
Soldan, M.[Mattia],
Pardo, A.[Alejandro],
Alcázar, J.L.[Juan León],
Heilbron, F.C.[Fabian Caba],
Zhao, C.[Chen],
Giancola, S.[Silvio],
Ghanem, B.[Bernard],
MAD: A Scalable Dataset for Language Grounding in Videos from Movie
Audio Descriptions,
CVPR22(5016-5025)
IEEE DOI
2210
Grounding, Annotations, Pipelines, Natural languages,
Machine learning, Benchmark testing, Vision + language,
Video analysis and understanding
BibRef
Yang, L.[Li],
Xu, Y.[Yan],
Yuan, C.F.[Chun-Feng],
Liu, W.[Wei],
Li, B.[Bing],
Hu, W.M.[Wei-Ming],
Improving Visual Grounding with Visual-Linguistic Verification and
Iterative Reasoning,
CVPR22(9489-9498)
IEEE DOI
2210
Location awareness, Visualization, Grounding, Natural languages,
Object detection, Transformers, Cognition, Recognition: detection, retrieval
BibRef
Li, L.H.[Liunian Harold],
Zhang, P.C.[Peng-Chuan],
Zhang, H.T.[Hao-Tian],
Yang, J.W.[Jian-Wei],
Li, C.Y.[Chun-Yuan],
Zhong, Y.[Yiwu],
Wang, L.J.[Li-Juan],
Yuan, L.[Lu],
Zhang, L.[Lei],
Hwang, J.N.[Jenq-Neng],
Chang, K.W.[Kai-Wei],
Gao, J.F.[Jian-Feng],
Grounded Language-Image Pre-training,
CVPR22(10955-10965)
IEEE DOI
2210
Visualization, Image recognition, Head, Grounding, Object detection,
Data models, Deep learning architectures and techniques,
Vision + language
BibRef
Li, Y.C.[Yi-Cong],
Wang, X.[Xiang],
Xiao, J.B.[Jun-Bin],
Ji, W.[Wei],
Chua, T.S.[Tat-Seng],
Invariant Grounding for Video Question Answering,
CVPR22(2918-2927)
IEEE DOI
2210
Visualization, Correlation, Grounding, Semantics, Predictive models,
Linguistics, Question answering (information retrieval),
Vision + language
BibRef
Yang, Z.Y.[Zheng-Yuan],
Zhang, S.Y.[Song-Yang],
Wang, L.W.[Li-Wei],
Luo, J.B.[Jie-Bo],
SAT: 2D Semantics Assisted Training for 3D Visual Grounding,
ICCV21(1836-1846)
IEEE DOI
2203
Training, Point cloud compression, Representation learning,
Visualization, Grounding, Semantics, Vision + language
BibRef
Chen, J.W.[Jun-Wen],
Kong, Y.[Yu],
Explainable Video Entailment with Grounded Visual Evidence,
ICCV21(2001-2010)
IEEE DOI
2203
Training, Visualization, Grounding, Computational modeling,
Decision making, Focusing, Vision + language, Video analysis and understanding
BibRef
Zhao, L.C.[Li-Chen],
Cai, D.[Daigang],
Sheng, L.[Lu],
Xu, D.[Dong],
3DVG-Transformer: Relation Modeling for Visual Grounding on Point
Clouds,
ICCV21(2908-2917)
IEEE DOI
2203
Point cloud compression, Multiplexing, Visualization,
Solid modeling, Grounding, Transformers,
Vision + language
BibRef
Feng, M.[Mingtao],
Li, Z.[Zhen],
Li, Q.[Qi],
Zhang, L.[Liang],
Zhang, X.D.[Xiang-Dong],
Zhu, G.M.[Guang-Ming],
Zhang, H.[Hui],
Wang, Y.[Yaonan],
Mian, A.[Ajmal],
Free-form Description Guided 3D Visual Graph Network for Object
Grounding in Point Cloud,
ICCV21(3702-3711)
IEEE DOI
2203
Point cloud compression, Visualization, Correlation, Grounding,
Natural languages, Detection and localization in 2D and 3D,
Visual reasoning and logical representation
BibRef
Ding, X.P.[Xin-Peng],
Wang, N.N.[Nan-Nan],
Zhang, S.[Shiwei],
Cheng, D.[De],
Li, X.M.[Xiao-Meng],
Huang, Z.Y.[Zi-Yuan],
Tang, M.Q.[Ming-Qian],
Gao, X.B.[Xin-Bo],
Support-Set Based Cross-Supervision for Video Grounding,
ICCV21(11553-11562)
IEEE DOI
2203
Training, Visualization, Costs, Correlation, Grounding, Semantics,
Image and video retrieval, Vision + language
BibRef
Khandelwal, S.[Siddhesh],
Suhail, M.[Mohammed],
Sigal, L.[Leonid],
Segmentation-grounded Scene Graph Generation,
ICCV21(15859-15869)
IEEE DOI
2203
Image segmentation, Visualization, Grounding, Annotations, Genomics,
Scene analysis and understanding,
Transfer/Low-shot/Semi/Unsupervised Learning
BibRef
Patel, S.[Shivansh],
Wani, S.[Saim],
Jain, U.[Unnat],
Schwing, A.[Alexander],
Lazebnik, S.[Svetlana],
Savva, M.[Manolis],
Chang, A.X.[Angel X.],
Interpretation of Emergent Communication in Heterogeneous
Collaborative Embodied Agents,
ICCV21(15933-15943)
IEEE DOI
2203
Systematics, Navigation, Grounding, Collaboration, Task analysis,
Vision for robotics and autonomous vehicles, Explainable AI,
Visual reasoning and logical representation
BibRef
Shi, J.[Jing],
Zhong, Y.[Yiwu],
Xu, N.[Ning],
Li, Y.[Yin],
Xu, C.L.[Chen-Liang],
A Simple Baseline for Weakly-Supervised Scene Graph Generation,
ICCV21(16373-16382)
IEEE DOI
2203
Visualization, Grounding, Computational modeling, Pipelines,
Genomics, Complexity theory, Scene analysis and understanding, Vision + language
BibRef
Su, R.[Rui],
Yu, Q.[Qian],
Xu, D.[Dong],
STVGBert: A Visual-linguistic Transformer based Framework for
Spatio-temporal Video Grounding,
ICCV21(1513-1522)
IEEE DOI
2203
Representation learning, Visualization, Grounding, Detectors,
Benchmark testing, Transformers, Electron tubes,
Vision + language, Video analysis and understanding
BibRef
Cui, C.Y.Q.[Claire Yu-Qing],
Khandelwal, A.[Apoorv],
Artzi, Y.[Yoav],
Snavely, N.[Noah],
Averbuch-Elor, H.[Hadar],
Who's Waldo? Linking People Across Text and Images,
ICCV21(1354-1364)
IEEE DOI
2203
Visualization, Codes, Grounding, Force, Benchmark testing,
Transformers, Vision + language, Datasets and evaluation
BibRef
González, C.[Cristina],
Ayobi, N.[Nicolás],
Hernández, I.[Isabela],
Hernández, J.[José],
Pont-Tuset, J.[Jordi],
Arbeláez, P.[Pablo],
Panoptic Narrative Grounding,
ICCV21(1344-1353)
IEEE DOI
2203
Measurement, Visualization, Image segmentation, Grounding,
Annotations, Semantics, Vision + language, grouping and shape
BibRef
Hong, Y.[Yining],
Li, Q.[Qing],
Zhu, S.C.[Song-Chun],
Huang, S.Y.[Si-Yuan],
VLGrammar: Grounded Grammar Induction of Vision and Language,
ICCV21(1645-1654)
IEEE DOI
2203
Visualization, Semantics, Natural languages, Image retrieval,
Probabilistic logic, Vision + language
BibRef
Kamath, A.[Aishwarya],
Singh, M.[Mannat],
LeCun, Y.[Yann],
Synnaeve, G.[Gabriel],
Misra, I.[Ishan],
Carion, N.[Nicolas],
MDETR: Modulated Detection for End-to-End Multi-Modal Understanding,
ICCV21(1760-1770)
IEEE DOI
2203
Visualization, Vocabulary, Image segmentation, Grounding, Detectors,
Vision + language,
Visual reasoning and logical representation
BibRef
Yuan, Z.H.[Zhi-Hao],
Yan, X.[Xu],
Liao, Y.H.[Ying-Hong],
Zhang, R.[Ruimao],
Wang, S.[Sheng],
Li, Z.[Zhen],
Cui, S.G.[Shu-Guang],
InstanceRefer: Cooperative Holistic Understanding for Visual
Grounding on Point Clouds through Instance Multi-level Contextual
Referring,
ICCV21(1771-1780)
IEEE DOI
2203
Location awareness, Point cloud compression, Visualization,
Solid modeling, Grounding, Predictive models, Vision + language,
Visual reasoning and logical representation
BibRef
Deng, J.J.[Jia-Jun],
Yang, Z.Y.[Zheng-Yuan],
Chen, T.L.[Tian-Lang],
Zhou, W.G.[Wen-Gang],
Li, H.Q.[Hou-Qiang],
TransVG: End-to-End Visual Grounding with Transformers,
ICCV21(1749-1759)
IEEE DOI
2203
Visualization, Codes, Grounding, Manuals, Transformers, Cognition,
Vision + language, Vision + other modalities
BibRef
Soldan, M.[Mattia],
Xu, M.M.[Meng-Meng],
Qu, S.[Sisi],
Tegner, J.[Jesper],
Ghanem, B.[Bernard],
VLG-Net: Video-Language Graph Matching Network for Video Grounding,
CVEU21(3217-3227)
IEEE DOI
2112
Location awareness, Grounding,
Semantics, Syntactics, Graph neural networks
BibRef
Lu, X.P.[Xiao-Peng],
Fan, Z.[Zhen],
Wang, Y.[Yansen],
Oh, J.[Jean],
Rosé, C.P.[Carolyn P.],
Localize, Group, and Select: Boosting Text-VQA by Scene Text Modeling,
XSAnim21(2631-2639)
IEEE DOI
2112
Integrated optics, Visualization, Grounding,
Computational modeling, Knowledge discovery
BibRef
Song, S.[Sijie],
Lin, X.D.[Xu-Dong],
Liu, J.Y.[Jia-Ying],
Guo, Z.M.[Zong-Ming],
Chang, S.F.[Shih-Fu],
Co-Grounding Networks with Semantic Attention for Referring
Expression Comprehension in Videos,
CVPR21(1346-1355)
IEEE DOI
2111
Visualization, Correlation, Grounding,
Computational modeling, Semantics, Benchmark testing
BibRef
Tian, Y.P.[Ya-Peng],
Hu, D.[Di],
Xu, C.L.[Chen-Liang],
Cyclic Co-Learning of Sounding Object Visual Grounding and Sound
Separation,
CVPR21(2744-2753)
IEEE DOI
2111
Training, Visualization, Codes, Grounding,
Computational modeling, Pattern recognition
BibRef
Nan, G.S.[Guo-Shun],
Qiao, R.[Rui],
Xiao, Y.[Yao],
Liu, J.[Jun],
Leng, S.C.[Si-Cong],
Zhang, H.[Hao],
Lu, W.[Wei],
Interventional Video Grounding with Dual Contrastive Learning,
CVPR21(2764-2774)
IEEE DOI
2111
Visualization, Correlation, Grounding, Benchmark testing,
Knowledge discovery, Data models, Pattern recognition
BibRef
Zhao, Y.[Yang],
Zhao, Z.[Zhou],
Zhang, Z.[Zhu],
Lin, Z.J.[Zhi-Jie],
Cascaded Prediction Network via Segment Tree for Temporal Video
Grounding,
CVPR21(4195-4204)
IEEE DOI
2111
Costs, Grounding, Navigation, Fuses,
Benchmark testing, Pattern recognition
BibRef
Liu, Y.F.[Yong-Fei],
Wan, B.[Bo],
Ma, L.[Lin],
He, X.M.[Xu-Ming],
Relation-aware Instance Refinement for Weakly Supervised Visual
Grounding,
CVPR21(5608-5617)
IEEE DOI
2111
Location awareness, Learning systems, Visualization, Grounding,
Semantics, Noise reduction, Benchmark testing
BibRef
Liu, H.L.[Hao-Lin],
Lin, A.[Anran],
Han, X.G.[Xiao-Guang],
Yang, L.[Lei],
Yu, Y.Z.[Yi-Zhou],
Cui, S.G.[Shu-Guang],
Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in
RGBD Images,
CVPR21(6028-6037)
IEEE DOI
2111
Heating systems, Geometry, Visualization,
Grounding, Fuses, Feature extraction
BibRef
Lin, X.R.[Xiang-Ru],
Li, G.B.[Guan-Bin],
Yu, Y.Z.[Yi-Zhou],
Scene-Intuitive Agent for Remote Embodied Visual Grounding,
CVPR21(7032-7041)
IEEE DOI
2111
Training, Visualization, Grounding, Navigation, Fuses, Semantics, Pipelines
BibRef
Liu, D.Z.[Dai-Zong],
Qu, X.Y.[Xiao-Ye],
Dong, J.F.[Jian-Feng],
Zhou, P.[Pan],
Cheng, Y.[Yu],
Wei, W.[Wei],
Xu, Z.[Zichuan],
Xie, Y.[Yulai],
Context-aware Biaffine Localizing Network for Temporal Sentence
Grounding,
CVPR21(11230-11239)
IEEE DOI
2111
Location awareness, Codes, Grounding, Cognition,
Pattern recognition, Task analysis
BibRef
Meng, Z.[Zihang],
Yu, L.C.[Li-Cheng],
Zhang, N.[Ning],
Berg, T.[Tamara],
Damavandi, B.[Babak],
Singh, V.[Vikas],
Bearman, A.[Amy],
Connecting What to Say With Where to Look by Modeling Human Attention
Traces,
CVPR21(12674-12683)
IEEE DOI
2111
Measurement, Visualization, Grounding, Unified modeling language,
Training data, Transformers
BibRef
Sun, M.J.[Ming-Jie],
Xiao, J.[Jimin],
Lim, E.G.[Eng Gee],
Iterative Shrinking for Referring Expression Grounding Using Deep
Reinforcement Learning,
CVPR21(14055-14064)
IEEE DOI
2111
Art, Grounding, Reinforcement learning, Cognition,
Pattern recognition, Proposals
BibRef
Wang, L.W.[Li-Wei],
Huang, J.[Jing],
Li, Y.[Yin],
Xu, K.[Kun],
Yang, Z.Y.[Zheng-Yuan],
Yu, D.[Dong],
Improving Weakly Supervised Visual Grounding by Contrastive Knowledge
Distillation,
CVPR21(14085-14095)
IEEE DOI
2111
Training, Visualization, Technological innovation,
Costs, Grounding, Detectors
BibRef
Huang, B.B.[Bin-Bin],
Lian, D.Z.[Dong-Ze],
Luo, W.X.[Wei-Xin],
Gao, S.H.[Sheng-Hua],
Look Before You Leap:
Learning Landmark Features for One-Stage Visual Grounding,
CVPR21(16883-16892)
IEEE DOI
2111
Visualization, Grounding, Convolution,
Heuristic algorithms, Computational modeling, Linguistics
BibRef
Zhou, H.[Hao],
Zhang, C.Y.[Chong-Yang],
Luo, Y.[Yan],
Chen, Y.J.[Yan-Jun],
Hu, C.P.[Chuan-Ping],
Embracing Uncertainty: Decoupling and De-bias for Robust Temporal
Grounding,
CVPR21(8441-8450)
IEEE DOI
2111
Performance evaluation, Uncertainty, Grounding,
Annotations, Feature extraction, Robustness
BibRef
Khan, A.U.[Aisha Urooj],
Kuehne, H.[Hilde],
Duarte, K.[Kevin],
Gan, C.[Chuang],
Lobo, N.[Niels],
Shah, M.[Mubarak],
Found a Reason for me? Weakly-supervised Grounded Visual Question
Answering using Capsules,
CVPR21(8461-8470)
IEEE DOI
2111
Training, Visualization, Vocabulary, Grounding, Focusing, Detectors,
Knowledge discovery
BibRef
Zhang, S.Y.[Sheng-Yu],
Jiang, T.[Tan],
Wang, T.[Tan],
Kuang, K.[Kun],
Zhao, Z.[Zhou],
Zhu, J.[Jianke],
Yu, J.[Jin],
Yang, H.X.[Hong-Xia],
Wu, F.[Fei],
DeVLBert: Out-of-distribution Visio-Linguistic Pretraining with
Causality,
CiV21(1744-1747)
IEEE DOI
2109
Visualization, Correlation,
Image retrieval, Knowledge discovery
BibRef
Nguyen, A.T.[Andre T.],
Richards, L.E.[Luke E.],
Kebe, G.Y.[Gaoussou Youssouf],
Raff, E.[Edward],
Darvish, K.[Kasra],
Ferraro, F.[Frank],
Matuszek, C.[Cynthia],
Practical Cross-modal Manifold Alignment for Robotic Grounded
Language Learning,
MULA21(1613-1622)
IEEE DOI
2109
Manifolds, Measurement, Learning systems,
Natural languages, Robot sensing systems
BibRef
Shrestha, A.[Amar],
Pugdeethosapol, K.[Krittaphat],
Fang, H.[Haowen],
Qiu, Q.[Qinru],
MAGNet: Multi-Region Attention-Assisted Grounding of Natural Language
Queries at Phrase Level,
ICPR21(8275-8282)
IEEE DOI
2105
Visualization, Grounding, Fuses, Magnetic resonance imaging,
Natural languages, Games, Pattern recognition
BibRef
Zhang, Z.,
Zhao, Z.,
Zhao, Y.,
Wang, Q.,
Liu, H.,
Gao, L.,
Where Does It Exist: Spatio-Temporal Video Grounding for Multi-Form
Sentences,
CVPR20(10665-10674)
IEEE DOI
2008
Grounding, Task analysis, Visualization, Cognition,
Feature extraction, Natural languages
BibRef
Sadhu, A.[Arka],
Chen, K.[Kan],
Nevatia, R.[Ram],
Video Object Grounding Using Semantic Roles in Language Description,
CVPR20(10414-10424)
IEEE DOI
2008
grounds objects in videos referred to in natural language descriptions.
Semantics, Encoding, Proposals, Grounding, Visualization,
Task analysis, Feature extraction
BibRef
Ma, C.Y.[Chih-Yao],
Kalantidis, Y.[Yannis],
AlRegib, G.[Ghassan],
Vajda, P.[Peter],
Rohrbach, M.[Marcus],
Kira, Z.[Zsolt],
Learning to Generate Grounded Visual Captions Without Localization
Supervision,
ECCV20(XVIII:353-370).
Springer DOI
2012
BibRef
Gouthaman, K.V.,
Mittal, A.[Anurag],
Reducing Language Biases in Visual Question Answering with
Visually-grounded Question Encoder,
ECCV20(XIII:18-34).
Springer DOI
2011
BibRef
Zeng, R.H.[Run-Hao],
Xu, H.M.[Hao-Ming],
Huang, W.B.[Wen-Bing],
Chen, P.H.[Pei-Hao],
Tan, M.K.[Ming-Kui],
Gan, C.[Chuang],
Dense Regression Network for Video Grounding,
CVPR20(10284-10293)
IEEE DOI
2008
Grounding, Training, Task analysis, Proposals, Semantics,
Magnetic heads, Feature extraction
BibRef
Gupta, T.[Tanmay],
Vahdat, A.[Arash],
Chechik, G.[Gal],
Yang, X.D.[Xiao-Dong],
Kautz, J.[Jan],
Hoiem, D.[Derek],
Contrastive Learning for Weakly Supervised Phrase Grounding,
ECCV20(III:752-768).
Springer DOI
2012
BibRef
Tan, H.L.,
Leong, M.C.,
Xu, Q.,
Li, L.,
Fang, F.,
Cheng, Y.,
Gauthier, N.,
Sun, Y.,
Lim, J.H.,
Task-Oriented Multi-Modal Question Answering For Collaborative
Applications,
ICIP20(1426-1430)
IEEE DOI
2011
Task analysis, Collaboration, Grounding, Visualization, Cognition,
Training, Machine learning, question answering,
corpora
BibRef
Yang, S.[Sibei],
Li, G.B.[Guan-Bin],
Yu, Y.Z.[Yi-Zhou],
Propagating Over Phrase Relations for One-stage Visual Grounding,
ECCV20(XIX:589-605).
Springer DOI
2011
BibRef
Xiao, J.B.[Jun-Bin],
Shang, X.[Xindi],
Yang, X.[Xun],
Tang, S.[Sheng],
Chua, T.S.[Tat-Seng],
Visual Relation Grounding in Videos,
ECCV20(VI:447-464).
Springer DOI
2011
Code, Relations.
WWW Link.
BibRef
Mun, J.,
Cho, M.,
Han, B.,
Local-Global Video-Text Interactions for Temporal Grounding,
CVPR20(10807-10816)
IEEE DOI
2008
Semantics, Feature extraction, Grounding, Visualization, Proposals,
Task analysis, Context modeling
BibRef
Wu, C.,
Lin, Z.,
Cohen, S.,
Bui, T.,
Maji, S.,
PhraseCut: Language-Based Image Segmentation in the Wild,
CVPR20(10213-10222)
IEEE DOI
2008
Visualization, Grounding, Image segmentation, Task analysis,
Genomics, Bioinformatics, Natural languages
BibRef
Selvaraju, R.R.,
Tendulkar, P.,
Parikh, D.,
Horvitz, E.,
Ribeiro, M.T.,
Nushi, B.,
Kamar, E.,
SQuINTing at VQA Models: Introspecting VQA Models With Sub-Questions,
CVPR20(10000-10008)
IEEE DOI
2008
Cognition, Task analysis, Visualization, Image color analysis,
Grounding, Text recognition, Computational modeling
BibRef
Chen, L.[Lei],
Zhai, M.Y.[Meng-Yao],
He, J.W.[Jia-Wei],
Mori, G.[Greg],
Object Grounding via Iterative Context Reasoning,
MDALC19(1407-1415)
IEEE DOI
2004
Localize set of queries in the image.
image classification, image representation, image segmentation,
inference mechanisms, iterative methods, query processing,
weakly supervised learning
BibRef
Sinha, A.[Abhishek],
Akilesh, B.,
Sarkar, M.[Mausoom],
Krishnamurthy, B.[Balaji],
Attention Based Natural Language Grounding by Navigating Virtual
Environment,
WACV19(236-244)
IEEE DOI
1904
learning (artificial intelligence),
natural language processing, virtual reality,
Grounding
BibRef
Selvaraju, R.R.,
Lee, S.,
Shen, Y.,
Jin, H.,
Ghosh, S.,
Heck, L.,
Batra, D.,
Parikh, D.,
Taking a HINT: Leveraging Explanations to Make Vision and Language
Models More Grounded,
ICCV19(2591-2600)
IEEE DOI
2004
gradient methods, image retrieval, natural language processing,
neural nets, question answering (information retrieval), HINT,
Correlation
BibRef
Zhang, Y.,
Niebles, J.C.,
Soto, A.,
Interpretable Visual Question Answering by Visual Grounding From
Attention Supervision Mining,
WACV19(349-357)
IEEE DOI
1904
data mining, data visualisation, image representation,
learning (artificial intelligence),
Computer architecture
BibRef
Shi, J.[Jing],
Xu, J.[Jia],
Gong, B.Q.[Bo-Qing],
Xu, C.L.[Chen-Liang],
Not All Frames Are Equal: Weakly-Supervised Video Grounding With
Contextual Similarity and Visual Clustering Losses,
CVPR19(10436-10444).
IEEE DOI
2002
BibRef
Datta, S.[Samyak],
Sikka, K.[Karan],
Roy, A.[Anirban],
Ahuja, K.[Karuna],
Parikh, D.[Devi],
Divakaran, A.[Ajay],
Align2Ground: Weakly Supervised Phrase Grounding Guided by
Image-Caption Alignment,
ICCV19(2601-2610)
IEEE DOI
2004
image representation, image retrieval,
learning (artificial intelligence), Image coding
BibRef
Fang, Z.Y.[Zhi-Yuan],
Kong, S.[Shu],
Fowlkes, C.C.[Charless C.],
Yang, Y.Z.[Ye-Zhou],
Modularized Textual Grounding for Counterfactual Resilience,
CVPR19(6371-6381).
IEEE DOI
2002
BibRef
Zhuang, B.,
Wu, Q.,
Shen, C.,
Reid, I.D.,
van den Hengel, A.J.[Anton J.],
Parallel Attention: A Unified Framework for Visual Object Discovery
Through Dialogs and Queries,
CVPR18(4252-4261)
IEEE DOI
1812
Visualization, Task analysis, Cognition, Proposals, Grounding,
Correlation
BibRef
Yang, Z.Y.[Zheng-Yuan],
Chen, T.L.[Tian-Lang],
Wang, L.W.[Li-Wei],
Luo, J.B.[Jie-Bo],
Improving One-Stage Visual Grounding by Recursive Sub-query
Construction,
ECCV20(XIV:387-404).
Springer DOI
2011
Code, Query.
WWW Link.
BibRef
Zhang, H.W.[Han-Wang],
Niu, Y.L.[Yu-Lei],
Chang, S.F.[Shih-Fu],
Grounding Referring Expressions in Images by Variational Context,
CVPR18(4158-4166)
IEEE DOI
1812
Grounding, Context modeling, Task analysis, Visualization,
Pediatrics, Bayes methods, Natural languages
BibRef
Yu, L.C.[Li-Cheng],
Lin, Z.[Zhe],
Shen, X.H.[Xiao-Hui],
Yang, J.M.[Ji-Mei],
Lu, X.[Xin],
Bansal, M.[Mohit],
Berg, T.L.[Tamara L.],
MAttNet: Modular Attention Network for Referring Expression
Comprehension,
CVPR18(1307-1315)
IEEE DOI
1812
Localize image region described by natural language expression.
Visualization, Computational modeling, Task analysis, Cats,
Adaptation models, Feature extraction, Knowledge discovery
BibRef
Liu, D.Q.[Da-Qing],
Zhang, H.W.[Han-Wang],
Zha, Z.J.[Zheng-Jun],
Wu, F.[Feng],
Learning to Assemble Neural Module Tree Networks for Visual Grounding,
ICCV19(4672-4681)
IEEE DOI
2004
approximation theory, data visualisation, grammars,
learning (artificial intelligence), Training
BibRef
Sadhu, A.,
Chen, K.,
Nevatia, R.,
Zero-Shot Grounding of Objects From Natural Language Queries,
ICCV19(4693-4702)
IEEE DOI
2004
image classification, learning (artificial intelligence), Visualization,
natural language processing, object detection, query processing.
BibRef
Yang, Z.Y.[Zheng-Yuan],
Gong, B.Q.[Bo-Qing],
Wang, L.W.[Li-Wei],
Huang, W.B.[Wen-Bing],
Yu, D.[Dong],
Luo, J.B.[Jie-Bo],
A Fast and Accurate One-Stage Approach to Visual Grounding,
ICCV19(4682-4692)
IEEE DOI
2004
document image processing, feature extraction, image fusion,
image segmentation, natural language processing, Encoding
BibRef
Rohrbach, A.[Anna],
Rohrbach, M.[Marcus],
Tang, S.[Siyu],
Oh, S.J.[Seong Joon],
Schiele, B.[Bernt],
Generating Descriptions with Grounded and Co-referenced People,
CVPR17(4196-4206)
IEEE DOI
1711
Movie description.
Grounding, Head, Joining processes, Motion pictures, Videos, Visualization
BibRef
Zhu, Y.,
Kiros, R.,
Zemel, R.,
Salakhutdinov, R.,
Urtasun, R.,
Torralba, A.B.,
Fidler, S.,
Aligning Books and Movies: Towards Story-Like Visual Explanations by
Watching Movies and Reading Books,
ICCV15(19-27)
IEEE DOI
1602
Grounding
BibRef
Chapter on Implementations and Applications, Databases, QBIC, Video Analysis, Hardware and Software, Inspection continues in
Internet Label Information.