Das, A.[Abhishek],
Agrawal, H.[Harsh],
Zitnick, L.[Larry],
Parikh, D.[Devi],
Batra, D.[Dhruv],
Human Attention in Visual Question Answering:
Do Humans and Deep Networks Look at the Same Regions?,
CVIU(163), No. 1, 2017, pp. 90-100.
Elsevier DOI
1712
Visual Question Answering
BibRef
Malinowski, M.[Mateusz],
Rohrbach, M.[Marcus],
Fritz, M.[Mario],
Ask Your Neurons: A Deep Learning Approach to Visual Question Answering,
IJCV(125), No. 1-3, December 2017, pp. 110-135.
Springer DOI
1711
BibRef
Earlier:
Ask Your Neurons:
A Neural-Based Approach to Answering Questions about Images,
ICCV15(1-9)
IEEE DOI
1602
Deep learning for questions about real-world images.
A Visual Turing Test.
Language output based on visual and natural language input.
BibRef
Dancette, C.[Corentin],
Whitehead, S.[Spencer],
Maheshwary, R.[Rishabh],
Vedantam, R.[Ramakrishna],
Scherer, S.[Stefan],
Chen, X.L.[Xin-Lei],
Cord, M.[Matthieu],
Rohrbach, M.[Marcus],
Improving Selective Visual Question Answering by Learning from Your
Peers,
CVPR23(24049-24059)
IEEE DOI
2309
BibRef
Huang, Y.Z.[Yan-Zhou],
Zhong, T.[Tao],
Multitask learning for neural generative question answering,
RealTimeIP(14), No. 1, January 2018, pp. 1009-1017.
WWW Link.
1809
BibRef
Ruwa, N.[Nelson],
Mao, Q.[Qirong],
Song, H.P.[He-Ping],
Jia, H.J.[Hong-Jie],
Dong, M.[Ming],
Triple attention network for sentimental visual question answering,
CVIU(189), 2019, pp. 102829.
Elsevier DOI
1911
Visual question answering, Feature embedding, Attention model,
Sentiment analysis
BibRef
Bai, Z.W.[Zong-Wen],
Li, Y.[Ying],
Wozniak, M.[Marcin],
Zhou, M.L.[Mei-Li],
Li, D.[Di],
DecomVQANet: Decomposing visual question answering deep network via
tensor decomposition and regression,
PR(110), 2021, pp. 107538.
Elsevier DOI
2011
Tensor decomposition, Tensor regression layer,
Tensor contraction layer, Visual question answering
BibRef
Zhang, Q.S.[Quan-Shi],
Wu, Y.N.[Ying Nian],
Zhang, H.[Hao],
Zhu, S.C.[Song-Chun],
Mining deep And-Or object structures via cost-sensitive
question-answer-based active annotations,
CVIU(176-177), 2018, pp. 33-44.
Elsevier DOI
1812
Hierarchical graphical model, Part semantics
BibRef
Zhang, Q.S.[Quan-Shi],
Ren, J.[Jie],
Huang, G.[Ge],
Cao, R.M.[Rui-Ming],
Wu, Y.N.[Ying Nian],
Zhu, S.C.[Song-Chun],
Mining Interpretable AOG Representations From Convolutional Networks
via Active Question Answering,
PAMI(43), No. 11, November 2021, pp. 3949-3963.
IEEE DOI
2110
BibRef
Earlier: A1, A4, A5, A6, Only:
Mining Object Parts from CNNs via Active Question-Answering,
CVPR17(3890-3899)
IEEE DOI
1711
BibRef
Earlier: A1, A5, A6, Only:
Mining And-Or Graphs for Graph Matching and Object Discovery,
ICCV15(55-63)
IEEE DOI
1602
Semantics, Visualization, Head, Magnetic heads, Neural networks,
Information filters, Convolutional neural networks,
part localization.
Object detection, Object recognition, Semantics, Strain, Training,
Visualization
BibRef
Cao, Q.X.[Qing-Xing],
Liang, X.D.[Xiao-Dan],
Li, B.L.[Bai-Lin],
Lin, L.[Liang],
Interpretable Visual Question Answering by Reasoning on Dependency
Trees,
PAMI(43), No. 3, March 2021, pp. 887-901.
IEEE DOI
2102
Cognition, Visualization, Layout, Logic gates, Task analysis,
Knowledge discovery, Image coding, Visual question answering,
attention model
BibRef
Cao, Q.X.[Qing-Xing],
Liang, X.D.[Xiao-Dan],
Li, B.L.[Bai-Lin],
Li, G.,
Lin, L.[Liang],
Visual Question Reasoning on General Dependency Tree,
CVPR18(7249-7257)
IEEE DOI
1812
Cognition, Visualization, Layout, Feature extraction, Task analysis,
Collaboration, Neural networks
BibRef
Zhong, H.S.[Hua-Song],
Chen, J.Y.[Jing-Yuan],
Shen, C.[Chen],
Zhang, H.W.[Han-Wang],
Huang, J.Q.[Jian-Qiang],
Hua, X.S.[Xian-Sheng],
Self-Adaptive Neural Module Transformer for Visual Question Answering,
MultMed(23), 2021, pp. 1264-1273.
IEEE DOI
2105
Layout, Cognition, Task analysis, Visualization, Neural networks,
Knowledge discovery, Decoding, Visual question answering,
self-adaptive
BibRef
Zheng, W.F.[Wen-Feng],
Yin, L.R.[Li-Rong],
Chen, X.B.[Xia-Bing],
Ma, Z.Y.[Zhi-Yang],
Liu, S.[Shan],
Yang, B.[Bo],
Knowledge base graph embedding module design for Visual question
answering model,
PR(120), 2021, pp. 108153.
Elsevier DOI
2109
Faster R-CNN, DBpedia spotlight, knowledge base, VQA
BibRef
Sharma, H.[Himanshu],
Jalal, A.S.[Anand Singh],
Visual question answering model based on graph neural network and
contextual attention,
IVC(110), 2021, pp. 104165.
Elsevier DOI
2106
Visual question answering,
Natural language processing, Attention
BibRef
Song, L.Y.[Ling-Yun],
Li, J.[Jianao],
Liu, J.[Jun],
Yang, Y.[Yang],
Shang, X.[Xuequn],
Sun, M.X.[Ming-Xuan],
Answering knowledge-based visual questions via the exploration of
Question Purpose,
PR(133), 2023, pp. 109015.
Elsevier DOI
2210
Visual question answering, DNN, Question Purpose
BibRef
MeshuWelde, T.[Tesfayee],
Liao, L.[Lejian],
Counting-based visual question answering with serial cascaded
attention deep learning,
PR(144), 2023, pp. 109850.
Elsevier DOI
2310
Counting-based visual question answering, Visual geometry group16,
Text convolutional neural network,
Serial cascaded recurrent neural network with attention
mechanism-based long short-term memory
BibRef
Liu, Y.[Yang],
Li, G.B.[Guan-Bin],
Lin, L.[Liang],
Cross-Modal Causal Relational Reasoning for Event-Level Visual
Question Answering,
PAMI(45), No. 10, October 2023, pp. 11624-11641.
IEEE DOI
2310
BibRef
Cao, Q.X.[Qing-Xing],
Wan, W.T.[Wen-Tao],
Wang, K.[Keze],
Liang, X.D.[Xiao-Dan],
Lin, L.[Liang],
Linguistically Routing Capsule Network for Out-of-distribution Visual
Question Answering,
ICCV21(1594-1603)
IEEE DOI
2203
Visualization, Correlation, Fuses, Computational modeling, Merging,
Training data, Vision + language
BibRef
Yang, S.W.[Shu-Wen],
Xiao, L.[Luwei],
Wu, X.J.[Xing-Jiao],
Xu, J.J.[Jun-Jie],
Wang, L.L.[Lin-Lin],
He, L.[Liang],
Simple contrastive learning in a self-supervised manner for robust
visual question answering,
CVIU(241), 2024, pp. 103976.
Elsevier DOI
2403
Visual question answering, Deep learning, Contrastive learning,
Information extraction
BibRef
Wu, Y.L.[Yong-Liang],
Pan, X.[Xiao],
Li, J.H.[Jing-Hui],
Dou, S.[Shimao],
Wang, X.X.[Xiao-Xue],
Interpretable answer retrieval based on heterogeneous network
embedding,
PRL(182), 2024, pp. 9-16.
Elsevier DOI
2405
Interpretable question answering, Entity relationship fusion,
Heterogeneous Graph Embedding, Meta Path
BibRef
Sima, C.[Chonghao],
Renz, K.[Katrin],
Chitta, K.[Kashyap],
Chen, L.[Li],
Zhang, H.[Hanxue],
Xie, C.G.[Chen-Gen],
Beißwenger, J.[Jens],
Luo, P.[Ping],
Geiger, A.[Andreas],
Li, H.Y.[Hong-Yang],
DriveLM: Driving with Graph Visual Question Answering,
ECCV24(LII: 256-274).
Springer DOI
2412
BibRef
Feng, C.[Chen],
Danier, D.[Duolikun],
Zhang, F.[Fan],
Bull, D.[David],
RankDVQA: Deep VQA based on Ranking-inspired Hybrid Training,
WACV24(1637-1647)
IEEE DOI Code:
WWW Link.
2404
Training, Measurement, Deep learning, Correlation, Databases,
Source coding, Network architecture, Algorithms,
Datasets and evaluations
BibRef
Ishay, A.[Adam],
Yang, Z.[Zhun],
Lee, J.[Joohyung],
Kang, I.[Ilgu],
Lim, D.J.[Dong-Jae],
Think before You Simulate: Symbolic Reasoning to Orchestrate Neural
Computation for Counterfactual Question Answering,
WACV24(6684-6693)
IEEE DOI
2404
Logic programming, Computational modeling, Computer architecture,
Benchmark testing, Predictive models, Cognition, Algorithms,
Vision + language and/or other modalities
BibRef
Wang, Y.[Yanan],
Yasunaga, M.[Michihiro],
Ren, H.Y.[Hong-Yu],
Wada, S.[Shinya],
Leskovec, J.[Jure],
VQA-GNN: Reasoning with Multimodal Knowledge via Graph Neural
Networks for Visual Question Answering,
ICCV23(21525-21535)
IEEE DOI
2401
BibRef
Souza, B.[Bruno],
Aasan, M.[Marius],
Pedrini, H.[Helio],
Rivera, A.R.[Adin Ramirez],
SelfGraphVQA: A Self-Supervised Graph Neural Network for Scene-based
Question Answering,
VLAR23(4642-4647)
IEEE DOI
2401
BibRef
Haisa, G.[Gulizada],
Altenbek, G.[Gulila],
Question Classification Based on Weak Supervision and Interrogative
Pronouns Attention Mechanism,
ICPR22(2273-2278)
IEEE DOI
2212
Deep learning, Dictionaries, Costs, Annotations, Neural networks,
Feature extraction, Question answering (information retrieval), Kazakh
BibRef
Nguyen, B.X.[Binh X.],
Do, T.[Tuong],
Tran, H.[Huy],
Tjiputra, E.[Erman],
Tran, Q.D.[Quang D.],
Nguyen, A.[Anh],
Coarse-to-Fine Reasoning for Visual Question Answering,
MULA22(4557-4565)
IEEE DOI
2210
Deep learning, Visualization, Codes, Semantics, Neural networks,
Feature extraction, Cognition
BibRef
Liang, Y.Y.[Yao-Yuan],
Wang, X.[Xin],
Duan, X.G.[Xu-Guang],
Zhu, W.W.[Wen-Wu],
Multi-modal Contextual Graph Neural Network for Text Visual Question
Answering,
ICPR21(3491-3498)
IEEE DOI
2105
Visualization, Image recognition, Text recognition,
Target recognition, Shape, Knowledge discovery, Graph neural networks
BibRef
Patro, B.N.,
Kurmi, V.K.,
Kumar, S.,
Namboodiri, V.P.,
Deep Bayesian Network for Visual Question Generation,
WACV20(1555-1565)
IEEE DOI
2006
Bayes methods, Task analysis, Visualization, Uncertainty, Decoding,
Probabilistic logic, Semantics
BibRef
Singh, A.K.,
Mishra, A.,
Shekhar, S.,
Chakraborty, A.,
From Strings to Things: Knowledge-Enabled VQA Model That Can Read and
Reason,
ICCV19(4601-4611)
IEEE DOI
2004
document image processing, graph theory, inference mechanisms,
neural nets, text analysis, visual content proposals, Proposals
BibRef
Wilf, A.[Alex],
Ma, M.Q.[Martin Q.],
Liang, P.P.[Paul Pu],
Zadeh, A.[Amir],
Morency, L.P.[Louis-Philippe],
Face-to-Face Contrastive Learning for Social Intelligence
Question-Answering,
FG23(1-7)
IEEE DOI
2303
Heuristic algorithms, Face recognition, Gesture recognition,
Benchmark testing, Graph neural networks, Social intelligence, Task analysis
BibRef
Zadeh, A.[Amir],
Chan, M.[Michael],
Liang, P.P.[Paul Pu],
Tong, E.[Edmund],
Morency, L.P.[Louis-Philippe],
Social-IQ: A Question Answering Benchmark for Artificial Social
Intelligence,
CVPR19(8799-8809).
IEEE DOI
2002
BibRef
Ma, C.,
Shen, C.,
Dick, A.,
Wu, Q.,
Wang, P.,
van den Hengel, A.J.[Anton J.],
Reid, I.D.,
Visual Question Answering with Memory-Augmented Networks,
CVPR18(6975-6984)
IEEE DOI
1812
Visualization, Neural networks, Training, Knowledge discovery,
Feature extraction, Bidirectional control, Prediction algorithms
BibRef
Shin, A.,
Ushiku, Y.,
Harada, T.,
Customized Image Narrative Generation via Interactive Visual Question
Generation and Answering,
CVPR18(8925-8933)
IEEE DOI
1812
Visualization, Task analysis, Feature extraction, Proposals,
Knowledge discovery, Recurrent neural networks, Training
BibRef
Teney, D.,
Anderson, P.,
He, X.,
van den Hengel, A.J.[Anton J.],
Tips and Tricks for Visual Question Answering:
Learnings from the 2017 Challenge,
CVPR18(4223-4232)
IEEE DOI
1812
Training, Visualization, Task analysis, Neural networks,
Knowledge discovery, Logic gates, Computer architecture
BibRef
Bai, Y.L.[Ya-Long],
Fu, J.L.[Jian-Long],
Zhao, T.J.[Tie-Jun],
Mei, T.[Tao],
Deep Attention Neural Tensor Network for Visual Question Answering,
ECCV18(XII: 21-37).
Springer DOI
1810
BibRef
Sinha, A.[Abhishek],
Ayush, K.[Kumar],
Towards Mathematical Reasoning: A Multimodal Deep Learning Approach,
ICIP18(4028-4032)
IEEE DOI
1809
Mathematical model, Task analysis, Visualization, Decoding,
Computational modeling, Machine learning, Numerical models,
Mathematical Reasoning
BibRef
Rosso-Mateus, A.[Andrés],
González, F.A.[Fabio A.],
Montes-y-Gómez, M.[Manuel],
A Two-Step Neural Network Approach to Passage Retrieval for Open Domain
Question Answering,
CIARP17(566-574).
Springer DOI
1802
BibRef
Zhu, C.,
Zhao, Y.,
Huang, S.,
Tu, K.,
Ma, Y.,
Structured Attentions for Visual Question Answering,
ICCV17(1300-1309)
IEEE DOI
1802
belief networks, data visualisation, image retrieval,
inference mechanisms, neural nets,
Visualization
BibRef
Hu, R.,
Andreas, J.,
Rohrbach, M.,
Darrell, T.J.,
Saenko, K.,
Learning to Reason:
End-to-End Module Networks for Visual Question Answering,
ICCV17(804-813)
IEEE DOI
1802
computational linguistics, grammars, natural language processing,
neural net architecture,
Visualization
BibRef
Peris, Á.[Álvaro],
Casacuberta, F.[Francisco],
Interactive-Predictive Neural Multimodal Systems,
IbPRIA19(I:16-28).
Springer DOI
1910
BibRef
Bolaños, M.[Marc],
Peris, Á.[Álvaro],
Casacuberta, F.[Francisco],
Radeva, P.[Petia],
VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question
Answering,
IbPRIA17(372-380).
Springer DOI
1706
BibRef
Kafle, K.[Kushal],
Kanan, C.[Christopher],
An Analysis of Visual Question Answering Algorithms,
ICCV17(1983-1991)
IEEE DOI
1802
BibRef
Earlier:
Answer-Type Prediction for Visual Question Answering,
CVPR16(4976-4984)
IEEE DOI
1612
case-based reasoning, data visualisation,
image retrieval, neural nets, Visualization
BibRef
Wang, P.,
Wu, Q.,
Shen, C.,
van den Hengel, A.J.[Anton J.],
The VQA-Machine: Learning How to Use Existing Vision Algorithms to
Answer New Questions,
CVPR17(3909-3918)
IEEE DOI
1711
Cognition, Data mining, Neural networks, Prediction algorithms,
Telescopes, Visualization
BibRef
Yu, D.,
Fu, J.,
Mei, T.,
Rui, Y.,
Multi-level Attention Networks for Visual Question Answering,
CVPR17(4187-4195)
IEEE DOI
1711
Feature extraction, Knowledge discovery, Natural languages,
Recurrent neural networks, Semantics, Visualization
BibRef
Ramakrishnan, S.K.,
Pal, A.,
Sharma, G.,
Mittal, A.,
An Empirical Evaluation of Visual Question Answering for Novel
Objects,
CVPR17(7312-7321)
IEEE DOI
1711
Knowledge discovery, Recurrent neural networks, Training,
Training data, Visualization, Vocabulary
BibRef