Das, A.[Abhishek],
Kottur, S.[Satwik],
Gupta, K.[Khushi],
Singh, A.[Avi],
Yadav, D.[Deshraj],
Lee, S.[Stefan],
Moura, J.M.F.[José M. F.],
Parikh, D.[Devi],
Batra, D.[Dhruv],
Visual Dialog,
PAMI(41), No. 5, May 2019, pp. 1242-1256.
IEEE DOI
1904
Hold a meaningful dialog about visual content.
Visualization, Task analysis, Artificial intelligence, History,
Protocols, Natural languages, Wheelchairs, Visual dialog,
machine learning
BibRef
Zhao, Z.[Zhou],
Zhang, Z.[Zhu],
Jiang, X.H.[Xing-Hua],
Cai, D.[Deng],
Multi-Turn Video Question Answering via Hierarchical Attention
Context Reinforced Networks,
IP(28), No. 8, August 2019, pp. 3860-3872.
IEEE DOI
1907
learning (artificial intelligence), natural language processing,
reinforcement learning
BibRef
Gu, M.[Mao],
Zhao, Z.[Zhou],
Jin, W.[Weike],
Cai, D.[Deng],
Wu, F.[Fei],
Video Dialog via Multi-Grained Convolutional Self-Attention Context
Multi-Modal Networks,
CirSysVideo(30), No. 12, December 2020, pp. 4453-4466.
IEEE DOI
2012
Visualization, Knowledge discovery, History, Task analysis,
Context modeling, Decoding, Computational modeling, Video dialog,
convolution
BibRef
Guo, D.,
Wang, H.,
Wang, S.,
Wang, M.,
Textual-Visual Reference-Aware Attention Network for Visual Dialog,
IP(29), 2020, pp. 6655-6666.
IEEE DOI
2007
Visualization, Semantics, History, Correlation, Head, Cognition,
Task analysis, Visual dialog, attention network, textual reference,
multimodal semantic interaction
BibRef
Patro, B.N.[Badri N.],
Anupriy,
Namboodiri, V.P.[Vinay P.],
Probabilistic framework for solving visual dialog,
PR(110), 2021, pp. 107586.
Elsevier DOI
2011
CNN, LSTM, Uncertainty, Aleatoric uncertainty,
Epistemic uncertainty, Vision and language, Visual dialog, VQA,
Bayesian deep learning
BibRef
Zhao, L.[Lei],
Lyu, X.Y.[Xin-Yu],
Song, J.K.[Jing-Kuan],
Gao, L.L.[Lian-Li],
GuessWhich? Visual dialog with attentive memory network,
PR(114), 2021, pp. 107823.
Elsevier DOI
2103
Visual dialog, Attentive memory network, Reinforcement learning
BibRef
Jiang, T.L.[Tian-Ling],
Shao, H.L.[Hai-Lin],
Tian, X.[Xin],
Ji, Y.[Yi],
Liu, C.P.[Chun-Ping],
Aligning vision-language for graph inference in visual dialog,
IVC(116), 2021, pp. 104316.
Elsevier DOI
2112
Visual dialog, Alignment, Graph inference, Scene graph
BibRef
Guo, D.[Dan],
Wang, H.[Hui],
Wang, M.[Meng],
Context-Aware Graph Inference With Knowledge Distillation for Visual
Dialog,
PAMI(44), No. 10, October 2022, pp. 6056-6073.
IEEE DOI
2209
Visualization, Task analysis, History, Cognition, Semantics,
Linguistics, Image edge detection, Visual dialog,
knowledge distillation
BibRef
Guo, D.[Dan],
Wang, H.[Hui],
Zhang, H.W.[Han-Wang],
Zha, Z.J.[Zheng-Jun],
Wang, M.[Meng],
Iterative Context-Aware Graph Inference for Visual Dialog,
CVPR20(10052-10061)
IEEE DOI
2008
Visualization, History, Task analysis, Semantics, Message passing,
Neural networks, Cognition
BibRef
Patro, B.N.[Badri N.],
Anupriy,
Namboodiri, V.P.[Vinay P.],
Explanation vs. attention: A two-player game to obtain attention for
VQA and visual dialog,
PR(132), 2022, pp. 108898.
Elsevier DOI
2209
CNN, LSTM, Explanation, Attention, Grad-CAM, MMD, CORAL, GAN, VQA,
Visual Dialog, Deep learning
BibRef
Zhu, Y.[Ye],
Wu, Y.[Yu],
Yang, Y.[Yi],
Yan, Y.[Yan],
Saying the Unseen: Video Descriptions via Dialog Agents,
PAMI(44), No. 10, October 2022, pp. 7190-7204.
IEEE DOI
2209
Task analysis, Visualization, Artificial intelligence,
Natural languages, Knowledge transfer, Semantics,
multi-modal learning
BibRef
Huang, Y.[Yan],
Wang, Y.M.[Yu-Ming],
Wang, L.[Liang],
Efficient Image and Sentence Matching,
PAMI(45), No. 3, March 2023, pp. 2970-2983.
IEEE DOI
2302
Matrix decomposition, Symmetric matrices, Computational modeling,
Predictive models, Analytical models, Task analysis,
vision and language
BibRef
Zhao, L.[Lei],
Li, J.L.[Jun-Lin],
Gao, L.L.[Lian-Li],
Rao, Y.[Yunbo],
Song, J.K.[Jing-Kuan],
Shen, H.T.[Heng Tao],
Heterogeneous Knowledge Network for Visual Dialog,
CirSysVideo(33), No. 2, February 2023, pp. 861-871.
IEEE DOI
2302
Visualization, Feature extraction, Task analysis, History,
Knowledge engineering, Semantics, Meters, Heterogeneous knowledge,
visual dialog
BibRef
Buçinca, Z.[Zana],
Yemez, Y.[Yücel],
Erzin, E.[Engin],
Sezgin, M.[Metin],
AffectON: Incorporating Affect Into Dialog Generation,
AffCom(14), No. 1, January 2023, pp. 823-835.
IEEE DOI
2303
Task analysis, Decoding, Training, Syntactics, Semantics, Computers,
Recurrent neural networks, Affective computing, affective dialog generation
BibRef
Tripathi, A.[Aditay],
Mishra, A.[Anand],
Chakraborty, A.[Anirban],
Grounding Scene Graphs on Natural Images via Visio-Lingual Message
Passing,
WACV23(4380-4389)
IEEE DOI
2302
Location awareness, Visualization, Grounding, Message passing,
Image edge detection, Semantics, Directed graphs,
Vision + language and/or other modalities
BibRef
Byun, J.[Jaeseok],
Hwang, T.[Taebaek],
Fu, J.L.[Jian-Long],
Moon, T.[Taesup],
GRIT-VLP: Grouped Mini-batch Sampling for Efficient Vision and Language
Pre-training,
ECCV22(XIX:395-412).
Springer DOI
2211
WWW Link.
BibRef
Yan, S.P.[Shi-Peng],
Hong, L.[Lanqing],
Xu, H.[Hang],
Han, J.H.[Jian-Hua],
Tuytelaars, T.[Tinne],
Li, Z.G.[Zhen-Guo],
He, X.M.[Xu-Ming],
Generative Negative Text Replay for Continual Vision-Language
Pretraining,
ECCV22(XXXVI:22-38).
Springer DOI
2211
BibRef
Cai, Z.W.[Zhao-Wei],
Kwon, G.[Gukyeong],
Ravichandran, A.[Avinash],
Bas, E.[Erhan],
Tu, Z.W.[Zhuo-Wen],
Bhotika, R.[Rahul],
Soatto, S.[Stefano],
X-DETR: A Versatile Architecture for Instance-wise Vision-Language
Tasks,
ECCV22(XXXVI:290-308).
Springer DOI
2211
BibRef
Zhang, Y.F.[Yi-Feng],
Jiang, M.[Ming],
Zhao, Q.[Qi],
New Datasets and Models for Contextual Reasoning in Visual Dialog,
ECCV22(XXXVI:434-451).
Springer DOI
2211
BibRef
Pham, H.A.[Hoang-Anh],
Le, T.M.[Thao Minh],
Le, V.[Vuong],
Phuong, T.M.[Tu Minh],
Tran, T.[Truyen],
Video Dialog as Conversation About Objects Living in Space-Time,
ECCV22(XXIX:710-726).
Springer DOI
2211
BibRef
Zhang, Z.F.[Ze-Fan],
Jiang, T.L.[Tian-Ling],
Liu, C.P.[Chun-Ping],
Ji, Y.[Yi],
Coupling Attention and Convolution for Heuristic Network in Visual
Dialog,
ICIP22(2896-2900)
IEEE DOI
2211
Couplings, Visualization, Convolution, Semantics, Benchmark testing,
Thalamus, Visual dialog, attention, convolution
BibRef
Zhang, H.Y.[Hang-Yu],
Li, Y.M.[Ying-Ming],
Zhang, Z.F.[Zhong-Fei],
Video-Grounded Dialogues with Joint Video and Image Training,
ICIP22(3903-3907)
IEEE DOI
2211
Training, Visualization, Transformers, Feature extraction,
Data mining, Video-grounded Dialogues, Multimodality,
Transformer
BibRef
Zhang, S.[Shunyu],
Jiang, X.Z.[Xiao-Ze],
Yang, Z.[Zequn],
Wan, T.[Tao],
Qin, Z.[Zengchang],
Reasoning with Multi-Structure Commonsense Knowledge in Visual Dialog,
MULA22(4599-4608)
IEEE DOI
2210
Visualization, Fuses, Semantics, Knowledge based systems,
Oral communication, Transformers, Pattern recognition
BibRef
Zhu, Y.[Yi],
Weng, Y.[Yue],
Zhu, F.D.[Feng-Da],
Liang, X.D.[Xiao-Dan],
Ye, Q.X.[Qi-Xiang],
Lu, Y.T.[Yu-Tong],
Jiao, J.B.[Jian-Bin],
Self-Motivated Communication Agent for Real-World Vision-Dialog
Navigation,
ICCV21(1574-1583)
IEEE DOI
2203
Costs, Uncertainty, Navigation, Annotations, Reinforcement learning,
Optimization, Vision + language
BibRef
Engin, D.[Deniz],
Schnitzler, F.[François],
Duong, N.Q.K.[Ngoc Q. K.],
Avrithis, Y.[Yannis],
On the hidden treasure of dialog in video question answering,
ICCV21(2044-2053)
IEEE DOI
2203
Location awareness, TV, Codes, Video description, Annotations,
Knowledge based systems, Video analysis and understanding, Vision + language
BibRef
Matsumori, S.[Shoya],
Shingyouchi, K.[Kosuke],
Abe, Y.[Yuki],
Fukuchi, Y.[Yosuke],
Sugiura, K.[Komei],
Imai, M.[Michita],
Unified Questioner Transformer for Descriptive Question Generation in
Goal-Oriented Visual Dialogue,
ICCV21(1878-1887)
IEEE DOI
2203
Visualization, Buildings, Transformers,
Task analysis, Artificial intelligence, Vision + language,
Visual reasoning and logical representation
BibRef
Tu, T.[Tao],
Ping, Q.[Qing],
Thattai, G.[Govindarajan],
Tur, G.[Gokhan],
Natarajan, P.[Prem],
Learning Better Visual Dialog Agents with Pretrained
Visual-Linguistic Representation,
CVPR21(5618-5627)
IEEE DOI
2111
Visualization, Games, Reinforcement learning,
Generators, Encoding, Pattern recognition
BibRef
Jiang, T.L.[Tian-Ling],
Ji, Y.[Yi],
Liu, C.P.[Chun-Ping],
Integrating Historical States and Co-attention Mechanism for Visual
Dialog,
ICPR21(2041-2048)
IEEE DOI
2105
Visualization, Benchmark testing, Cognition,
History, Task analysis, Faces
BibRef
Nguyen, V.Q.[Van-Quang],
Suganuma, M.[Masanori],
Okatani, T.[Takayuki],
Efficient Attention Mechanism for Visual Dialog that Can Handle All the
Interactions Between Multiple Inputs,
ECCV20(XXIV:223-240).
Springer DOI
2012
BibRef
Murahari, V.[Vishvak],
Batra, D.[Dhruv],
Parikh, D.[Devi],
Das, A.[Abhishek],
Large-scale Pretraining for Visual Dialog:
A Simple State-of-the-art Baseline,
ECCV20(XVIII:336-352).
Springer DOI
2012
BibRef
Zhu, Y.[Ye],
Wu, Y.[Yu],
Yang, Y.[Yi],
Yan, Y.[Yan],
Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents,
ECCV20(XXIII:153-169).
Springer DOI
2011
BibRef
Qi, J.,
Niu, Y.,
Huang, J.,
Zhang, H.,
Two Causal Principles for Improving Visual Dialog,
CVPR20(10857-10866)
IEEE DOI
2008
Visualization, History, Task analysis, Data models, Training, Feeds, Decoding
BibRef
Abbasnejad, E.[Ehsan],
Teney, D.[Damien],
Parvaneh, A.[Amin],
Shi, J.[Javen],
van den Hengel, A.J.[Anton J.],
Counterfactual Vision and Language Learning,
CVPR20(10041-10051)
IEEE DOI
2008
Training, Visualization, Training data, Task analysis,
Machine learning, Knowledge discovery, Data models
BibRef
Zhu, Y.,
Zhu, F.,
Zhan, Z.,
Lin, B.,
Jiao, J.,
Chang, X.,
Liang, X.,
Vision-Dialog Navigation by Exploring Cross-Modal Memory,
CVPR20(10727-10736)
IEEE DOI
2008
Navigation, Visualization, Task analysis, History, Memory modules,
Natural languages, Decision making
BibRef
Yang, T.,
Zha, Z.,
Zhang, H.,
Making History Matter:
History-Advantage Sequence Training for Visual Dialog,
ICCV19(2561-2569)
IEEE DOI
2004
image retrieval, image sequences, interactive systems, neural nets,
question answering (information retrieval), Decoding
BibRef
Guo, D.[Dalu],
Xu, C.[Chang],
Tao, D.C.[Da-Cheng],
Image-Question-Answer Synergistic Network for Visual Dialog,
CVPR19(10426-10435).
IEEE DOI
2002
BibRef
Zheng, Z.L.[Zi-Long],
Wang, W.G.[Wen-Guan],
Qi, S.Y.[Si-Yuan],
Zhu, S.C.[Song-Chun],
Reasoning Visual Dialogs With Structural and Partial Observations,
CVPR19(6662-6671).
IEEE DOI
2002
BibRef
Bani, G.[Gabriele],
Belli, D.[Davide],
Dagan, G.[Gautier],
Geenen, A.[Alexander],
Skliar, A.[Andrii],
Venkatesh, A.[Aashish],
Baumgärtner, T.[Tim],
Bruni, E.[Elia],
Fernández, R.[Raquel],
Adding Object Detection Skills to Visual Dialogue Agents,
VL18(IV:180-187).
Springer DOI
1905
BibRef
Yang, M.,
Yang, N.S.R.,
Zhang, K.,
Tao, J.,
Self-Talk: Responses to Users' Opinions and Challenges in Human
Computer Dialog,
ICPR18(2839-2844)
IEEE DOI
1812
History, Robots, Databases, Predictive models, Pattern recognition,
Automation, Search engines, human computer dialog, abstract extraction
BibRef
Jain, U.,
Schwing, A.,
Lazebnik, S.,
Two Can Play This Game: Visual Dialog with Discriminative Question
Generation and Answering,
CVPR18(5754-5763)
IEEE DOI
1812
Visualization, Task analysis, History, Knowledge discovery,
Measurement, Training, Computer architecture
BibRef
Dokania, P.K.,
Torr, P.H.S.,
Siddharth, N.,
Massiceti, D.,
FLIPDIAL: A Generative Model for Two-Way Visual Dialogue,
CVPR18(6097-6105)
IEEE DOI
1812
Visualization, Task analysis, Computational modeling, History,
Data models, Pediatrics, Image color analysis
BibRef
Wu, Q.,
Wang, P.,
Shen, C.,
Reid, I.D.,
van den Hengel, A.J.[Anton J.],
Are You Talking to Me? Reasoned Visual Dialog Generation Through
Adversarial Learning,
CVPR18(6106-6115)
IEEE DOI
1812
Visualization, Task analysis, Generators, History,
Computational modeling, Image color analysis
BibRef
Kottur, S.[Satwik],
Moura, J.M.F.[José M. F.],
Parikh, D.[Devi],
Batra, D.[Dhruv],
Rohrbach, M.[Marcus],
Visual Coreference Resolution in Visual Dialog Using Neural Module
Networks,
ECCV18(XV:160-178).
Springer DOI
1810
BibRef
Strub, F.[Florian],
Seurin, M.[Mathieu],
Perez, E.[Ethan],
de Vries, H.[Harm],
Mary, J.[Jérémie],
Preux, P.[Philippe],
Courville, A.[Aaron],
Pietquin, O.[Olivier],
Visual Reasoning with Multi-hop Feature Modulation,
ECCV18(VI:808-831).
Springer DOI
1810
BibRef
Das, A.,
Kottur, S.,
Moura, J.M.F.,
Lee, S.,
Batra, D.,
Learning Cooperative Visual Dialog Agents with Deep Reinforcement
Learning,
ICCV17(2970-2979)
IEEE DOI
1802
interactive systems, learning (artificial intelligence),
multi-agent systems, natural language interfaces, robot vision,
Visualization
BibRef
de Vries, H.[Harm],
Strub, F.[Florian],
Chandar, S.[Sarath],
Pietquin, O.[Olivier],
Larochelle, H.[Hugo],
Courville, A.[Aaron],
GuessWhat?! Visual Object Discovery through Multi-modal Dialogue,
CVPR17(4466-4475)
IEEE DOI
1711
Databases, Games, Knowledge discovery,
Natural languages, Visualization
BibRef
Nam, H.[Hyeonseob],
Ha, J.W.[Jung-Woo],
Kim, J.[Jeonghee],
Dual Attention Networks for Multimodal Reasoning and Matching,
CVPR17(2156-2164)
IEEE DOI
1711
Cognition, Knowledge discovery, Mathematical model,
Neural networks, Semantics, Visualization
BibRef
Johnson, J.[Justin],
Hariharan, B.[Bharath],
van der Maaten, L.[Laurens],
Hoffman, J.,
Fei-Fei, L.[Li],
Zitnick, C.L.[C. Lawrence],
Girshick, R.[Ross],
Inferring and Executing Programs for Visual Reasoning,
ICCV17(3008-3017)
IEEE DOI
1802
BibRef
Earlier: A1, A2, A3, A5, A6, A7, Only:
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary
Visual Reasoning,
CVPR17(1988-1997)
IEEE DOI
1711
Dataset, Visual Reasoning.
WWW Link.
backpropagation, image matching,
learning (artificial intelligence), neural nets,
Visualization.
Cognition, Image color analysis, Metals, Semantics, Shape.
BibRef
Das, A.[Abhishek],
Kottur, S.[Satwik],
Gupta, K.[Khushi],
Singh, A.[Avi],
Yadav, D.[Deshraj],
Moura, J.M.F.[José M. F.],
Parikh, D.[Devi],
Batra, D.[Dhruv],
Visual Dialog,
CVPR17(1080-1089)
IEEE DOI
1711
Hold a dialog with humans in a natural visual context.
History, Knowledge discovery, Protocols, Visualization, Wheelchairs
BibRef