19.4.3.3 Visual Question Answering, Query, VQA

Question Answer. Visual Q-A. VQA. Subsets:
See also Video Question Answering, Movies, Spatio-Temporal, Query, VQA.
See also Visual Dialog. And the related:
See also Visual Grounding, Grounding Expressions.
See also Visual Question Answering, Datasets, Benchmarks, Surveys. Other Datasets may be in:
See also Object Recognition, Retrieval Datasets.
See also Context in Computer Vision.

Agrawal, A.[Aishwarya], Lu, J.[Jiasen], Antol, S.[Stanislaw], Mitchell, M.[Margaret], Zitnick, C.L.[C. Lawrence], Parikh, D.[Devi], Batra, D.[Dhruv],
VQA: Visual Question Answering,
IJCV(123), No. 1, May 2017, pp. 4-31.
Springer DOI 1705
BibRef

Malinowski, M.[Mateusz], Rohrbach, M.[Marcus], Fritz, M.[Mario],
Ask Your Neurons: A Deep Learning Approach to Visual Question Answering,
IJCV(125), No. 1-3, December 2017, pp. 110-135.
Springer DOI 1711
BibRef
Earlier:
Ask Your Neurons: A Neural-Based Approach to Answering Questions about Images,
ICCV15(1-9)
IEEE DOI 1602
Deep learning for questions about real-world images. A Visual Turing Test. Language output based on visual and natural language input. BibRef

Tamaazousti, Y.[Youssef], Le Borgne, H.[Hervé], Popescu, A.[Adrian], Gadeski, E.[Etienne], Ginsca, A.[Alexandru], Hudelot, C.[Céline],
Vision-language integration using constrained local semantic features,
CVIU(163), No. 1, 2017, pp. 41-57.
Elsevier DOI 1712
Image classification BibRef

Das, A.[Abhishek], Agrawal, H.[Harsh], Zitnick, L.[Larry], Parikh, D.[Devi], Batra, D.[Dhruv],
Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions?,
CVIU(163), No. 1, 2017, pp. 90-100.
Elsevier DOI 1712
Visual Question Answering BibRef

Lioutas, V.[Vasileios], Passalis, N.[Nikolaos], Tefas, A.[Anastasios],
Explicit ensemble attention learning for improving visual question answering,
PRL(111), 2018, pp. 51-57.
Elsevier DOI 1808
Visual question answering, Explicit attention, Pictorial superiority effect BibRef

Huang, Y.Z.[Yan-Zhou], Zhong, T.[Tao],
Multitask learning for neural generative question answering,
RealTimeIP(14), No. 1, January 2018, pp. 1009-1017.
WWW Link. 1809
BibRef

Zhang, Q.S.[Quan-Shi], Wu, Y.N.[Ying Nian], Zhang, H.[Hao], Zhu, S.C.[Song-Chun],
Mining deep And-Or object structures via cost-sensitive question-answer-based active annotations,
CVIU(176-177), 2018, pp. 33-44.
Elsevier DOI 1812
Hierarchical graphical model, Part semantics BibRef

Zhang, Q.S.[Quan-Shi], Ren, J.[Jie], Huang, G.[Ge], Cao, R.M.[Rui-Ming], Wu, Y.N.[Ying Nian], Zhu, S.C.[Song-Chun],
Mining Interpretable AOG Representations From Convolutional Networks via Active Question Answering,
PAMI(43), No. 11, November 2021, pp. 3949-3963.
IEEE DOI 2110
BibRef
Earlier: A1, A4, A5, A6, Only:
Mining Object Parts from CNNs via Active Question-Answering,
CVPR17(3890-3899)
IEEE DOI 1711
BibRef
Earlier: A1, A5, A6, Only:
Mining And-Or Graphs for Graph Matching and Object Discovery,
ICCV15(55-63)
IEEE DOI 1602
Semantics, Visualization, Head, Magnetic heads, Neural networks, Information filters, Convolutional neural networks, part localization. Object detection, Object recognition, Semantics, Strain, Training, Visualization BibRef

Garg, S.[Shivam], Srivastava, R.[Rajeev],
Object sequences: encoding categorical and spatial information for a yes/no visual question answering task,
IET-CV(12), No. 8, December 2018, pp. 1141-1150.
DOI Link 1812
BibRef

Goyal, Y.[Yash], Khot, T.[Tejas], Agrawal, A.[Aishwarya], Summers-Stay, D.[Douglas], Batra, D.[Dhruv], Parikh, D.[Devi],
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering,
IJCV(127), No. 4, April 2019, pp. 398-414.
Springer DOI 1903
BibRef
Earlier: A1, A2, A4, A5, A6, Only: CVPR17(6325-6334)
IEEE DOI 1711
Benchmark testing, Data collection, Data models, Knowledge discovery, Protocols, Visualization BibRef

Fang, Z.W.[Zhi-Wei], Liu, J.[Jing], Li, Y.[Yong], Qiao, Y.Y.[Yan-Yuan], Lu, H.Q.[Han-Qing],
Improving visual question answering using dropout and enhanced question encoder,
PR(90), 2019, pp. 404-414.
Elsevier DOI 1903
Visual question answering, Coherent dropout, Siamese dropout, Enhanced question encoder BibRef

Osman, A.[Ahmed], Samek, W.[Wojciech],
DRAU: Dual Recurrent Attention Units for Visual Question Answering,
CVIU(185), 2019, pp. 24-30.
Elsevier DOI 1906
Visual Question Answering, Attention Mechanisms, Multi-modal Learning, Machine Vision, Natural Language Processing BibRef

Toor, A.S.[Andeep S.], Wechsler, H.[Harry], Nappi, M.[Michele],
Biometric surveillance using visual question answering,
PRL(126), 2019, pp. 111-118.
Elsevier DOI 1909
Biometrics, Forensics, Visual question answering, Question relevance, Surveillance, Deep learning, Visual turing test BibRef

Ruwa, N.[Nelson], Mao, Q.[Qirong], Song, H.P.[He-Ping], Jia, H.J.[Hong-Jie], Dong, M.[Ming],
Triple attention network for sentimental visual question answering,
CVIU(189), 2019, pp. 102829.
Elsevier DOI 1911
Visual question answering, Feature embedding, Attention model, Sentiment analysis BibRef

Li, W.W.[Wen-Wen], Song, M.M.[Miao-Miao], Tian, Y.Y.[Yuan-Yuan],
An Ontology-Driven Cyberinfrastructure for Intelligent Spatiotemporal Question Answering and Open Knowledge Discovery,
IJGI(8), No. 11, 2019, pp. xx-yy.
DOI Link 1912
BibRef

Xi, Y.L.[Yu-Ling], Zhang, Y.N.[Yan-Ning], Ding, S.T.[Song-Tao], Wan, S.H.[Shao-Hua],
Visual Question Answering Model Based on Visual Relationship Detection,
SP:IC(80), 2020, pp. 115648.
Elsevier DOI 1912
Visual question answering, Appearance features, Relationship predicate, Word vector similarity BibRef

Wu, Y., Jiang, L., Yang, Y.,
Revisiting EmbodiedQA: A Simple Baseline and Beyond,
IP(29), 2020, pp. 3984-3992.
IEEE DOI 2002
Embodied question answering, vision and language, visual question answering BibRef

Huang, C.[Chaoran], Yao, L.[Lina], Wang, X.Z.[Xian-Zhi], Benatallah, B.[Boualem], Zhang, X.[Xiang],
Software expert discovery via knowledge domain embeddings in a collaborative network,
PRL(130), 2020, pp. 46-53.
Elsevier DOI 2002
Knowledge discovery, Stack overflow, Expertise finding, Question answering, Expert as a Service BibRef

Li, W.[Wei], Sun, J.H.[Jian-Hui], Liu, G.[Ge], Zhao, L.[Linglan], Fang, X.Z.[Xiang-Zhong],
Visual question answering with attention transfer and a cross-modal gating mechanism,
PRL(133), 2020, pp. 334-340.
Elsevier DOI 2005
Attention, Visual question answering, Gating BibRef

Messina, N.[Nicola], Amato, G.[Giuseppe], Carrara, F.[Fabio], Falchi, F.[Fabrizio], Gennaro, C.[Claudio],
Learning visual features for relational CBIR,
MultInfoRetr(9), No. 2, June 2020, pp. 113-124.
Springer DOI 2005
BibRef
Earlier:
Learning Relationship-Aware Visual Features,
CEFR-LCV18(IV:486-501).
Springer DOI 1905
BibRef

Methani, N., Ganguly, P., Khapra, M.M., Kumar, P.,
PlotQA: Reasoning over Scientific Plots,
WACV20(1516-1525)
IEEE DOI 2006
Vocabulary, Cognition, Bars, Numerical models, Optical character recognition software, Data mining, Image color analysis BibRef

Yu, J.[Jing], Zhu, Z.H.[Zi-Hao], Wang, Y.J.[Yu-Jing], Zhang, W.F.[Wei-Feng], Hu, Y.[Yue], Tan, J.L.[Jian-Long],
Cross-modal knowledge reasoning for knowledge-based visual question answering,
PR(108), 2020, pp. 107563.
Elsevier DOI 2008
Cross-modal knowledge reasoning, Multimodal knowledge graphs, Compositional reasoning module, Explainable reasoning BibRef

Yang, Z.Q.[Zhuo-Qian], Qin, Z.C.[Zeng-Chang], Yu, J.[Jing], Wan, T.[Tao],
Prior Visual Relationship Reasoning For Visual Question Answering,
ICIP20(1411-1415)
IEEE DOI 2011
Visualization, Semantics, Convolution, Cognition, Knowledge discovery, Benchmark testing, Measurement, VQA, GCN, Attention Mechanism BibRef

Bai, Z.W.[Zong-Wen], Li, Y.[Ying], Wozniak, M.[Marcin], Zhou, M.L.[Mei-Li], Li, D.[Di],
DecomVQANet: Decomposing visual question answering deep network via tensor decomposition and regression,
PR(110), 2021, pp. 107538.
Elsevier DOI 2011
Tensor decomposition, Tensor regression layer, Tensor contraction layer, Visual question answering BibRef

Farazi, M.R.[Moshiur R.], Khan, S.H.[Salman H.], Barnes, N.[Nick],
From known to the unknown: Transferring knowledge to answer questions about novel visual and semantic concepts,
IVC(103), 2020, pp. 103985.
Elsevier DOI 2011
Visual Question Answering, Deep learning, Natural language processing, Dataset bias BibRef

Terao, K.[Kento], Tamaki, T.[Toru], Raytchev, B.[Bisser], Kaneda, K.[Kazufumi], Satoh, S.[Shin'ichi],
Rephrasing Visual Questions by Specifying the Entropy of the Answer Distribution,
IEICE(E103-D), No. 11, November 2020, pp. 2362-2370.
WWW Link. 2011
BibRef

Yu, J.[Jing], Zhang, W.F.[Wei-Feng], Lu, Y.H.[Yu-Hang], Qin, Z.C.[Zeng-Chang], Hu, Y.[Yue], Tan, J.L.[Jian-Long], Wu, Q.[Qi],
Reasoning on the Relation: Enhancing Visual Representation for Visual Question Answering and Cross-Modal Retrieval,
MultMed(22), No. 12, December 2020, pp. 3196-3209.
IEEE DOI 2011
Visualization, Cognition, Task analysis, Knowledge discovery, Semantics, Correlation, Information retrieval, cross-modal information retrieval BibRef

Lobry, S., Marcos, D., Murray, J., Tuia, D.,
RSVQA: Visual Question Answering for Remote Sensing Data,
GeoRS(58), No. 12, December 2020, pp. 8555-8566.
IEEE DOI 2012
Remote sensing, Task analysis, Visualization, Data models, Feature extraction, Knowledge discovery, visual question answering (VQA) BibRef

Faure, M.[Maxime], Lobry, S.[Sylvain], Kurtz, C.[Camille], Wendling, L.[Laurent],
Embedding Spatial Relations in Visual Question Answering for Remote Sensing,
ICPR22(310-316)
IEEE DOI 2212
Training, Visualization, Histograms, Feature extraction, Question answering (information retrieval), Spatial databases BibRef

Chappuis, C.[Christel], Zermatten, V.[Valérie], Lobry, S.[Sylvain], Le Saux, B.[Bertrand], Tuia, D.[Devis],
Prompt-RSVQA: Prompting visual context to a language model for Remote Sensing Visual Question Answering,
EarthVision22(1371-1380)
IEEE DOI 2210
Training, Visualization, Natural languages, Feature extraction, Transformers, Question answering (information retrieval), Data mining BibRef

Sun, B.[Bo], Yao, Z.[Zeng], Zhang, Y.H.[Ying-Hui], Yu, L.J.[Le-Jun],
Local relation network with multilevel attention for visual question answering,
JVCIR(73), 2020, pp. 102762.
Elsevier DOI 2012
Visual question answering, Relation network, Attention mechanism BibRef

Wang, J.M.[Jian-Ming], Cui, E.[Enjie], Liu, K.L.[Kun-Liang], Sun, Y.K.[Yu-Kuan], Liang, J.Y.[Jia-Yu], Yuan, C.M.[Chun-Miao], Duan, X.J.[Xiao-Jie], Jin, G.H.[Guang-Hao], Chung, T.S.[Tae-Sun],
Referring expression comprehension model with matching detection and linguistic feedback,
IET-CV(14), No. 8, December 2020, pp. 625-633.
DOI Link 2012
BibRef

Li, X., Yuan, A., Lu, X.,
Vision-to-Language Tasks Based on Attributes and Attention Mechanism,
Cyber(51), No. 2, February 2021, pp. 913-926.
IEEE DOI 2101
Semantics, Task analysis, Visualization, Cats, Natural languages, Knowledge discovery, Feature extraction, Deep learning, visual question answering (VQA) BibRef

Cao, Q.X.[Qing-Xing], Liang, X.D.[Xiao-Dan], Li, B.L.[Bai-Lin], Lin, L.[Liang],
Interpretable Visual Question Answering by Reasoning on Dependency Trees,
PAMI(43), No. 3, March 2021, pp. 887-901.
IEEE DOI 2102
Cognition, Visualization, Layout, Logic gates, Task analysis, Knowledge discovery, Image coding, Visual question answering, attention model BibRef

Cao, Q.X.[Qing-Xing], Liang, X.D.[Xiao-Dan], Li, B.L.[Bai-Lin], Li, G., Lin, L.[Liang],
Visual Question Reasoning on General Dependency Tree,
CVPR18(7249-7257)
IEEE DOI 1812
Cognition, Visualization, Layout, Feature extraction, Task analysis, Collaboration, Neural networks BibRef

Shao, Y.[Yinan], Lin, J.C.W.[Jerry Chun-Wei], Srivastava, G.[Gautam], Jolfaei, A.[Alireza], Guo, D.D.[Dong-Dong], Hu, Y.[Yi],
Self-attention-based conditional random fields latent variables model for sequence labeling,
PRL(145), 2021, pp. 157-164.
Elsevier DOI 2104
Latent CRF, Sequence labeling, Encoding schema, Natural language processing, VQA, Big data BibRef

Zhong, H.S.[Hua-Song], Chen, J.Y.[Jing-Yuan], Shen, C.[Chen], Zhang, H.W.[Han-Wang], Huang, J.Q.[Jian-Qiang], Hua, X.S.[Xian-Sheng],
Self-Adaptive Neural Module Transformer for Visual Question Answering,
MultMed(23), 2021, pp. 1264-1273.
IEEE DOI 2105
Layout, Cognition, Task analysis, Visualization, Neural networks, Knowledge discovery, Decoding, Visual question answering, self-adaptive BibRef

Sharma, H.[Himanshu], Jalal, A.S.[Anand Singh],
Visual question answering model based on graph neural network and contextual attention,
IVC(110), 2021, pp. 104165.
Elsevier DOI 2106
Visual question answering, Natural language processing, Attention BibRef

Wu, Y.[Yirui], Ma, Y.T.[Yun-Tao], Wan, S.H.[Shao-Hua],
Multi-scale relation reasoning for multi-modal Visual Question Answering,
SP:IC(96), 2021, pp. 116319.
Elsevier DOI 2106
Multi-modal data, Visual Question Answering, Multi-scale relation reasoning, Attention model BibRef

Ma, Y.T.[Yun-Tao], Lu, T.[Tong], Wu, Y.[Yirui],
Multi-scale Relational Reasoning with Regional Attention for Visual Question Answering,
ICPR21(5642-5649)
IEEE DOI 2105
Visualization, Neural networks, Knowledge discovery, Cognition, Robustness, Data mining, Visual question learning, Attention, Multi-scale relational reasoning BibRef

dos S-Silva, F.H.[Francisco H.], Bezerra, G.M.[Gabriel M.], Holanda, G.B.[Gabriel B.], de Souza, J.W.M.[J. Wellington M.], Rego, P.A.L.[Paulo A.L.], Lira Neto, A.V.[Aloísio V.], de Albuquerque, V.H.C.[Victor Hugo C.], Rebouças Filho, P.P.[Pedro P.],
A novel feature extractor for human action recognition in visual question answering,
PRL(147), 2021, pp. 41-47.
Elsevier DOI 2106
BibRef

Guo, W.[Wenya], Zhang, Y.[Ying], Yang, J.F.[Ju-Feng], Yuan, X.J.[Xiao-Jie],
Re-Attention for Visual Question Answering,
IP(30), 2021, pp. 6730-6743.
IEEE DOI 2108
Visualization, Tires, Task analysis, Feature extraction, Training, Knowledge discovery, Image reconstruction, gating mechanism BibRef

Hu, J.[Jun], Qian, S.[Shengsheng], Fang, Q.[Quan], Xu, C.S.[Chang-Sheng],
Heterogeneous Community Question Answering via Social-Aware Multi-Modal Co-Attention Convolutional Matching,
MultMed(23), 2021, pp. 2321-2334.
IEEE DOI 2108
Visualization, Semantics, Knowledge discovery, Context modeling, Portable computers, Task analysis, Object detection, social multimedia BibRef

Farazi, M.[Moshiur], Khan, S.[Salman], Barnes, N.[Nick],
Accuracy vs. complexity: A trade-off in visual question answering models,
PR(120), 2021, pp. 108106.
Elsevier DOI 2109
Visual question answering, Visual feature extraction, Language features, Multi-modal fusion, Speed-accuracy trade-off BibRef

Zheng, W.F.[Wen-Feng], Yin, L.R.[Li-Rong], Chen, X.B.[Xia-Bing], Ma, Z.[Zhiyang], Liu, S.[Shan], Yang, B.[Bo],
Knowledge base graph embedding module design for Visual question answering model,
PR(120), 2021, pp. 108153.
Elsevier DOI 2109
Faster R-CNN, DBpedia spotlight, knowledge base, VQA BibRef

Barra, S.[Silvio], Bisogni, C.[Carmen], de Marsico, M.[Maria], Ricciardi, S.[Stefano],
Visual question answering: Which investigated applications?,
PRL(151), 2021, pp. 325-331.
Elsevier DOI 2110
Visual question answering, Real-world VQA, VQA for medical applications, VQA for assistive applications, VQA in cultural heritage and education BibRef

Manmadhan, S.[Sruthy], Kovoor, B.C.[Binsu C.],
Multi-Tier Attention Network using Term-weighted Question Features for Visual Question Answering,
IVC(115), 2021, pp. 104291.
Elsevier DOI 2110
Attention mechanism, Deep learning, Semantic similarity, Supervised term weighting, Visual Question Answering BibRef

Liu, A.A.[An-An], Lu, Z.[Zimu], Xu, N.[Ning], Nie, W.Z.[Wei-Zhi], Li, W.H.[Wen-Hui],
Multi-type decision fusion network for visual Q&A,
IVC(115), 2021, pp. 104281.
Elsevier DOI 2110
Visual question answering, Multi-type question, Scene graph BibRef

Patro, B.N.[Badri N.], Kurmi, V.K.[Vinod K.], Kumar, S.[Sandeep], Namboodiri, V.P.[Vinay P.],
MUMC: Minimizing uncertainty of mixture of cues,
IVC(115), 2021, pp. 104280.
Elsevier DOI 2110
Uncertainty estimation, Mixture of cues, Visual Question Answering, Paraphrase, Encoder-decoder BibRef

Liu, F.[Fei], Liu, J.[Jing], Fang, Z.W.[Zhi-Wei], Hong, R.C.[Ri-Chang], Lu, H.Q.[Han-Qing],
Visual Question Answering With Dense Inter- and Intra-Modality Interactions,
MultMed(23), 2021, pp. 3518-3529.
IEEE DOI 2110
Visualization, Knowledge discovery, Connectors, Encoding, Task analysis, Image coding, Stacking, Visual question answering, dense interactions BibRef

Wu, J.J.[Jia-Jia], Du, J.[Jun], Wang, F.[Fengren], Yang, C.[Chen], Jiang, X.Z.[Xin-Zhe], Hu, J.[Jinshui], Yin, B.[Bing], Zhang, J.S.[Jian-Shu], Dai, L.R.[Li-Rong],
A multimodal attention fusion network with a dynamic vocabulary for TextVQA,
PR(122), 2022, pp. 108214.
Elsevier DOI 2112
Dynamic vocabulary, Attention map, Multimodal fusion, ST-VQA BibRef

Narayanan, A.[Abhishek], Rao, A.[Abijna], Prasad, A.[Abhishek], Natarajan, S.,
VQA as a factoid question answering problem: A novel approach for knowledge-aware and explainable visual question answering,
IVC(116), 2021, pp. 104328.
Elsevier DOI 2112
Visual question answering, Factoid question answering, Knowledge based reasoning, Explainable VQA BibRef

Guo, Y.Y.[Yang-Yang], Nie, L.Q.[Li-Qiang], Cheng, Z.Y.[Zhi-Yong], Tian, Q.[Qi], Zhang, M.[Min],
Loss Re-Scaling VQA: Revisiting the Language Prior Problem From a Class-Imbalance View,
IP(31), 2022, pp. 227-238.
IEEE DOI 2112
Visualization, Training, Computational modeling, Benchmark testing, Predictive models, Cognition, Task analysis, loss re-scaling BibRef

Peng, L.[Liang], Yang, Y.[Yang], Wang, Z.[Zheng], Huang, Z.[Zi], Shen, H.T.[Heng Tao],
MRA-Net: Improving VQA Via Multi-Modal Relation Attention Network,
PAMI(44), No. 1, January 2022, pp. 318-329.
IEEE DOI 2112
Visualization, Feature extraction, Semantics, Knowledge discovery, Cognition, Task analysis, Natural languages, relation attention BibRef

Manogaran, G.[Gunasekaran], Shakeel, P.M.[P. Mohamed], Burhanuddin, M.A., Baskar, S., Saravanan, V.[Vijayalakshmi], Crespo, R.G.[Rubén González], Martínez, O.S.[Oscar Sanjuán],
ADCCF: Adaptive deep concatenation coder framework for visual question answering,
PRL(152), 2021, pp. 348-355.
Elsevier DOI 2112
BibRef

Zhou, Y.[Yiyi], Ji, R.R.[Rong-Rong], Sun, X.S.[Xiao-Shuai], Su, J.S.[Jin-Song], Meng, D.Y.[De-Yu], Gao, Y.[Yue], Shen, C.H.[Chun-Hua],
Plenty is Plague: Fine-Grained Learning for Visual Question Answering,
PAMI(44), No. 2, February 2022, pp. 697-709.
IEEE DOI 2201
Training, Visualization, Knowledge discovery, Redundancy, Data models, Feature extraction, Training data, visual question answering BibRef

E, W.N.[Wei-Nan], Zhou, Y.[Yajun],
A Mathematical Model for Universal Semantics,
PAMI(44), No. 3, March 2022, pp. 1124-1132.
IEEE DOI 2202
Semantics, Numerical models, Pattern analysis, Markov processes, Statistical analysis, Exponential distribution, question answering BibRef

Li, X.P.[Xiang-Peng], Wu, B.[Bo], Song, J.K.[Jing-Kuan], Gao, L.L.[Lian-Li], Zeng, P.P.[Peng-Peng], Gan, C.[Chuang],
Text-instance graph: Exploring the relational semantics for text-based visual question answering,
PR(124), 2022, pp. 108455.
Elsevier DOI 2203
Text-based visual question answering, Spatial overlapping, Text-Instance graph, Copy mechanism BibRef

Shao, X.J.[Xiang-Jun], Xiang, Z.L.[Zheng-Long], Li, Y.X.[Yuan-Xiang],
Visual question answering with gated relation-aware auxiliary,
IET-IPR(16), No. 5, 2022, pp. 1424-1432.
DOI Link 2203
BibRef

Liu, Y.[Yun], Zhang, X.M.[Xiao-Ming], Zhao, Z.Y.[Zhi-Yun], Zhang, B.[Bo], Cheng, L.[Lei], Li, Z.J.[Zhou-Jun],
ALSA: Adversarial Learning of Supervised Attentions for Visual Question Answering,
Cyber(52), No. 6, June 2022, pp. 4520-4533.
IEEE DOI 2207
Visualization, Correlation, Generators, Feature extraction, Task analysis, Knowledge discovery, Fuses, Adversarial learning, visual question answering (VQA) BibRef

Ouyang, N.L.[Ning-Lin], Huang, Q.B.[Qing-Bao], Li, P.J.[Pi-Jian], Cai, Y.[Yi], Liu, B.[Bin], Leung, H.F.[Ho-Fung], Li, Q.[Qing],
Suppressing Biased Samples for Robust VQA,
MultMed(24), 2022, pp. 3405-3415.
IEEE DOI 2207
Training, Visualization, Training data, Image color analysis, Sports, Knowledge discovery, Annotations, Visual Question Answering, Robust VQA BibRef

Shuang, K.[Kai], Guo, J.[Jinyu], Wang, Z.[Zihan],
Comprehensive-perception dynamic reasoning for visual question answering,
PR(131), 2022, pp. 108878.
Elsevier DOI 2208
Cross-modal information fusion, Visual question answering, Comprehensive perception, Relational reasoning BibRef

Gouthaman, K.V., Mittal, A.[Anurag],
On the role of question encoder sequence model in robust visual question answering,
PR(131), 2022, pp. 108883.
Elsevier DOI 2208
Visual question answering, Out-of-distribution performance, Gated recurrent unit, Transformer, Graph attention network BibRef

Zhou, K.Y.[Kai-Yang], Yang, J.K.[Jing-Kang], Loy, C.C.[Chen Change], Liu, Z.W.[Zi-Wei],
Learning to Prompt for Vision-Language Models,
IJCV(130), No. 9, September 2022, pp. 2337-2348.
Springer DOI 2208
BibRef

Zhou, K.Y.[Kai-Yang], Yang, J.K.[Jing-Kang], Loy, C.C.[Chen Change], Liu, Z.[Ziwei],
Conditional Prompt Learning for Vision-Language Models,
CVPR22(16795-16804)
IEEE DOI 2210
Training, Representation learning, Adaptation models, Neural networks, Manuals, Market research BibRef

Chen, C.Q.[Chong-Qing], Han, D.Z.[De-Zhi], Chang, C.C.[Chin-Chen],
CAAN: Context-Aware attention network for visual question answering,
PR(132), 2022, pp. 108980.
Elsevier DOI 2209
Visual question answering, Attention mechanism, Understanding bias, Absolute position, Contextual information BibRef

Song, L.Y.[Ling-Yun], Li, J.[Jianao], Liu, J.[Jun], Yang, Y.[Yang], Shang, X.[Xuequn], Sun, M.X.[Ming-Xuan],
Answering knowledge-based visual questions via the exploration of Question Purpose,
PR(133), 2023, pp. 109015.
Elsevier DOI 2210
Visual question answering, DNN, Question Purpose BibRef

Xie, J.Y.[Jia-Yuan], Fang, W.H.[Wen-Hao], Cai, Y.[Yi], Huang, Q.B.[Qing-Bao], Li, Q.[Qing],
Knowledge-Based Visual Question Generation,
CirSysVideo(32), No. 11, November 2022, pp. 7547-7558.
IEEE DOI 2211
Visualization, Feature extraction, Task analysis, Knowledge based systems, Knowledge representation, Decoding, multimodal BibRef

Gao, C.[Chenyu], Zhu, Q.[Qi], Wang, P.[Peng], Li, H.[Hui], Liu, Y.L.[Yu-Liang], van den Hengel, A.J.[Anton J.], Wu, Q.[Qi],
Structured Multimodal Attentions for TextVQA,
PAMI(44), No. 12, December 2022, pp. 9603-9614.
IEEE DOI 2212
Optical character recognition software, Cognition, Visualization, Text recognition, Task analysis, Knowledge discovery, Annotations, transformer BibRef

Jin, Z.X.[Zan-Xia], Wu, H.[Heran], Yang, C.[Chun], Zhou, F.[Fang], Qin, J.Y.[Jing-Yan], Xiao, L.[Lei], Yin, X.C.[Xu-Cheng],
RUArt: A Novel Text-Centered Solution for Text-Based Visual Question Answering,
MultMed(25), 2023, pp. 1-12.
IEEE DOI 2301
Optical character recognition software, Semantics, Visualization, Cognition, Knowledge discovery, Task analysis, Attention mechanism, visual question answering BibRef

Beckham, C.[Christopher], Weiss, M.[Martin], Golemo, F.[Florian], Honari, S.[Sina], Nowrouzezahrai, D.[Derek], Pal, C.[Christopher],
Visual question answering from another perspective: CLEVR mental rotation tests,
PR(136), 2023, pp. 109209.
Elsevier DOI 2301
Deep learning, Computer vision, Visual question answering, Contrastive learning, Clevr BibRef


Etesam, Y.[Yasaman], Kochiev, L.[Leon], Chang, A.X.[Angel X.],
3DVQA: Visual Question Answering for 3D Environments,
CRV22(233-240)
IEEE DOI 2301
Point cloud compression, Surface reconstruction, Lighting, Question answering (information retrieval), Noise measurement, 3D BibRef

Haisa, G.[Gulizada], Altenbek, G.[Gulila],
Question Classification Based on Weak Supervision and Interrogative Pronouns Attention Mechanism,
ICPR22(2273-2278)
IEEE DOI 2212
Deep learning, Dictionaries, Costs, Annotations, Neural networks, Feature extraction, Question answering (information retrieval), Kazakh BibRef

Ramamurthy, P.[Priyadharsini], Aakur, S.N.[Sathyanarayanan N.],
ISD-QA: Iterative Distillation of Commonsense Knowledge from General Language Models for Unsupervised Question Answering,
ICPR22(1229-1235)
IEEE DOI 2212
Transfer learning, Training data, Question answering (information retrieval), Data models, Iterative methods BibRef

Zhang, H.[Haotian], Wu, W.[Wei],
CAT: Re-Conv Attention in Transformer for Visual Question Answering,
ICPR22(1471-1477)
IEEE DOI 2212
Representation learning, Visualization, Predictive models, Performance gain, Transformers, Feature extraction, Multi-modal task BibRef

Liu, L.[Lei], Su, X.D.[Xiang-Dong], Guo, H.[Hui], Zhu, D.[Daobin],
A Transformer-based Medical Visual Question Answering Model,
ICPR22(1712-1718)
IEEE DOI 2212
Training, Visualization, Transformers, Feature extraction, Question answering (information retrieval), Stability analysis, Data mining BibRef

Boecking, B.[Benedikt], Usuyama, N.[Naoto], Bannur, S.[Shruthi], Castro, D.C.[Daniel C.], Schwaighofer, A.[Anton], Hyland, S.[Stephanie], Wetscherek, M.[Maria], Naumann, T.[Tristan], Nori, A.[Aditya], Alvarez-Valle, J.[Javier], Poon, H.[Hoifung], Oktay, O.[Ozan],
Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing,
ECCV22(XXXVI:1-21).
Springer DOI 2211
BibRef

Cui, Q.[Quan], Zhou, B.[Boyan], Guo, Y.[Yu], Yin, W.D.[Wei-Dong], Wu, H.[Hao], Yoshie, O.[Osamu], Chen, Y.[Yubo],
Contrastive Vision-Language Pre-training with Limited Resources,
ECCV22(XXXVI:236-253).
Springer DOI 2211
BibRef

Wu, X.Y.[Xiang-Yu], Lu, J.F.[Jian-Feng], Li, Z.F.[Zhuan-Feng], Xiong, F.C.[Feng-Chao],
Ques-to-Visual Guided Visual Question Answering,
ICIP22(4193-4197)
IEEE DOI 2211
Location awareness, Visualization, Fuses, Semantics, Benchmark testing, Question answering (information retrieval), channel attention BibRef

Sarkar, A.[Argho], Rahnemoonfar, M.[Maryam],
Grad-Cam Aware Supervised Attention for Visual Question Answering for Post-Disaster Damage Assessment,
ICIP22(3783-3787)
IEEE DOI 2211
Training, Visualization, Annotations, Pipelines, Question answering (information retrieval), Hurricanes, Grad-Cam BibRef

Whitehead, S.[Spencer], Petryk, S.[Suzanne], Shakib, V.[Vedaad], Gonzalez, J.[Joseph], Darrell, T.J.[Trevor J.], Rohrbach, A.[Anna], Rohrbach, M.[Marcus],
Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly,
ECCV22(XXXVI:148-166).
Springer DOI 2211
BibRef

Chen, L.[Long], Zheng, Y.H.[Yu-Hang], Xiao, J.[Jun],
Rethinking Data Augmentation for Robust Visual Question Answering,
ECCV22(XXXVI:95-112).
Springer DOI 2211
BibRef

Zhang, H.T.[Hao-Tian], Wu, W.[Wei],
Context Relation Fusion Model for Visual Question Answering,
ICIP22(2112-2116)
IEEE DOI 2211
Visualization, Question answering (information retrieval), Task analysis, Context modeling, Visual question answering, language bias BibRef

Biten, A.F.[Ali Furkan], Litman, R.[Ron], Xie, Y.S.[Yu-Sheng], Appalaraju, S.[Srikar], Manmatha, R.,
LaTr: Layout-Aware Transformer for Scene-Text VQA,
CVPR22(16527-16537)
IEEE DOI 2210
Training, Symbiosis, Visualization, Vocabulary, Layout, Transformers, Feature extraction, Vision + language, Scene analysis and understanding BibRef

Lu, J.Y.[Jia-Ying], Ye, X.[Xin], Ren, Y.[Yi], Yang, Y.Z.[Ye-Zhou],
Good, Better, Best: Textual Distractors Generation for Multiple-Choice Visual Question Answering via Reinforcement Learning,
ODRUM22(4917-4926)
IEEE DOI 2210
Training, Visualization, Computational modeling, Knowledge based systems, Training data, Reinforcement learning, Data models BibRef

Nguyen, B.X.[Binh X.], Do, T.[Tuong], Tran, H.[Huy], Tjiputra, E.[Erman], Tran, Q.D.[Quang D.], Nguyen, A.[Anh],
Coarse-to-Fine Reasoning for Visual Question Answering,
MULA22(4557-4565)
IEEE DOI 2210
Deep learning, Visualization, Codes, Semantics, Neural networks, Feature extraction, Cognition BibRef

Ding, Y.H.[Yi-Hao], Huang, Z.[Zhe], Wang, R.[Runlin], Zhang, Y.H.[Yan-Hang], Chen, X.[Xianru], Ma, Y.Z.[Yu-Zhong], Chung, H.[Hyunsuk], Han, S.C.[Soyeon Caren],
V-Doc: Visual questions answers with Documents,
CVPR22(21460-21466)
IEEE DOI 2210
Deep learning, Visualization, Computational modeling, Predictive models, Portable document format, Question answering (information retrieval) BibRef

Azuma, D.[Daichi], Miyanishi, T.[Taiki], Kurita, S.H.[Shu-Hei], Kawanabe, M.[Motoaki],
ScanQA: 3D Question Answering for Spatial Scene Understanding,
CVPR22(19107-19117)
IEEE DOI 2210
Location awareness, Measurement, Solid modeling, Visualization, Question answering (information retrieval), Vision + language, Scene analysis and understanding BibRef

Li, G.Y.[Guang-Yao], Wei, Y.[Yake], Tian, Y.[Yapeng], Xu, C.L.[Chen-Liang], Wen, J.R.[Ji-Rong], Hu, D.[Di],
Learning to Answer Questions in Dynamic Audio-Visual Scenarios,
CVPR22(19086-19096)
IEEE DOI 2210
Visualization, Image analysis, Codes, Computational modeling, Cognition, Question answering (information retrieval), Vision + language BibRef

Chen, C.[Chongyan], Anjum, S.[Samreen], Gurari, D.[Danna],
Grounding Answers for Visual Questions Asked by Visually Impaired People,
CVPR22(19076-19085)
IEEE DOI 2210
Visualization, Correlation, Grounding, Text recognition, Computational modeling, Visual impairment, Vision + language BibRef

Guo, X.Y.[Xiao-Yuan], Duan, J.L.[Jia-Li], Kuo, C.C.J.[C.C. Jay], Gichoya, J.W.[Judy Wawira], Banerjee, I.[Imon],
Augmenting Vision Language Pretraining by Learning Codebook with Visual Semantics,
ICPR22(4779-4785)
IEEE DOI 2212
Representation learning, Bridges, Visualization, Vocabulary, Semantics, Buildings, Benchmark testing BibRef

Yang, J.[Jinyu], Duan, J.L.[Jia-Li], Tran, S.[Son], Xu, Y.[Yi], Chanda, S.[Sampath], Chen, L.Q.[Li-Qun], Zeng, B.[Belinda], Chilimbi, T.[Trishul], Huang, J.Z.[Jun-Zhou],
Vision-Language Pre-Training with Triple Contrastive Learning,
CVPR22(15650-15659)
IEEE DOI 2210
Representation learning, Visualization, Question answering (information retrieval), Self- semi- meta- unsupervised learning BibRef

Walmer, M.[Matthew], Sikka, K.[Karan], Sur, I.[Indranil], Shrivastava, A.[Abhinav], Jha, S.[Susmit],
Dual-Key Multimodal Backdoors for Visual Question Answering,
CVPR22(15354-15364)
IEEE DOI 2210
Visualization, Training data, Detectors, Feature extraction, Question answering (information retrieval), Vision + language BibRef

Jing, C.C.[Chen-Chen], Jia, Y.D.[Yun-De], Wu, Y.W.[Yu-Wei], Liu, X.Y.[Xin-Yu], Wu, Q.[Qi],
Maintaining Reasoning Consistency in Compositional Visual Question Answering,
CVPR22(5089-5098)
IEEE DOI 2210
Visualization, Birds, Cognition, Question answering (information retrieval), Visual reasoning BibRef

Ding, Y.[Yang], Yu, J.[Jing], Liu, B.[Bang], Hu, Y.[Yue], Cui, M.X.[Ming-Xin], Wu, Q.[Qi],
MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering,
CVPR22(5079-5088)
IEEE DOI 2210
Bridges, Visualization, Codes, Computational modeling, Knowledge based systems, Semantics, Vision + language BibRef

Cascante-Bonilla, P.[Paola], Wu, H.[Hui], Wang, L.[Letao], Feris, R.[Rogerio], Ordonez, V.[Vicente],
SimVQA: Exploring Simulated Environments for Visual Question Answering,
CVPR22(5046-5056)
IEEE DOI 2210
Training, Visualization, Solid modeling, Computational modeling, Pipelines, Switches, Vision + language, Visual reasoning BibRef

Gao, F.[Feng], Ping, Q.[Qing], Thattai, G.[Govind], Reganti, A.[Aishwarya], Wu, Y.N.[Ying Nian], Natarajan, P.[Prem],
Transform-Retrieve-Generate: Natural Language-Centric Outside-Knowledge Visual Question Answering,
CVPR22(5057-5067)
IEEE DOI 2210
Knowledge engineering, Visualization, Solid modeling, Knowledge based systems, Natural languages, Transforms, Visual reasoning BibRef

Gupta, V.[Vipul], Li, Z.[Zhuowan], Kortylewski, A.[Adam], Zhang, C.[Chenyu], Li, Y.[Yingwei], Yuille, A.L.[Alan L.],
SwapMix: Diagnosing and Regularizing the Over-Reliance on Visual Context in Visual Question Answering,
CVPR22(5068-5078)
IEEE DOI 2210
Training, Visualization, Perturbation methods, Computational modeling, Predictive models, Robustness, Visual reasoning BibRef

Aflalo, E.[Estelle], Du, M.[Meng], Tseng, S.Y.[Shao-Yen], Liu, Y.F.[Yong-Fei], Wu, C.[Chenfei], Duan, N.[Nan], Lal, V.[Vasudev],
VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers,
CVPR22(21374-21383)
IEEE DOI 2210
Heating systems, Visualization, Machine vision, Computational modeling, Transformers, Question answering (information retrieval) BibRef

Burghouts, G.J.[Gertjan J.], Huizinga, W.[Wyke],
Coarse-to-Fine Visual Question Answering by Iterative, Conditional Refinement,
CIAP22(II:418-428).
Springer DOI 2205
BibRef

Li, Z.W.[Zhuo-Wan], Stengel-Eskin, E.[Elias], Zhang, Y.X.[Yi-Xiao], Xie, C.[Cihang], Tran, Q.[Quan], van Durme, B.[Benjamin], Yuille, A.L.[Alan L.],
Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images,
ICCV21(14890-14899)
IEEE DOI 2203
Visualization, Analytical models, Codes, Computational modeling, Cognition, Data models, Vision + language BibRef

Kant, Y.[Yash], Moudgil, A.[Abhinav], Batra, D.[Dhruv], Parikh, D.[Devi], Agrawal, H.[Harsh],
Contrast and Classify: Training Robust VQA Models,
ICCV21(1584-1593)
IEEE DOI 2203
Training, Visualization, Perturbation methods, Linguistics, Benchmark testing, Boosting, Vision + language BibRef

Han, X.Z.[Xin-Zhe], Wang, S.H.[Shu-Hui], Su, C.[Chi], Huang, Q.M.[Qing-Ming], Tian, Q.[Qi],
Greedy Gradient Ensemble for Robust Visual Question Answering,
ICCV21(1564-1573)
IEEE DOI 2203
Visualization, Analytical models, Annotations, Computational modeling, Feature extraction, Data models BibRef

Dancette, C.[Corentin], Cadène, R.[Rémi], Teney, D.[Damien], Cord, M.[Matthieu],
Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering,
ICCV21(1554-1563)
IEEE DOI 2203
Training, Visualization, Protocols, Codes, Image color analysis, Computational modeling, Vision + language, Explainable AI, Visual reasoning and logical representation BibRef

Zhou, Y.[Yiyi], Ren, T.[Tianhe], Zhu, C.Y.[Chao-Yang], Sun, X.S.[Xiao-Shuai], Liu, J.Z.[Jian-Zhuang], Ding, X.H.[Xing-Hao], Xu, M.L.[Ming-Liang], Ji, R.R.[Rong-Rong],
TRAR: Routing the Attention Spans in Transformer for Visual Question Answering,
ICCV21(2054-2064)
IEEE DOI 2203
Visualization, Schedules, Computational modeling, Transforms, Benchmark testing, Performance gain, Transformers BibRef

Yang, X.[Xu], Gao, C.Y.[Chong-Yang], Zhang, H.W.[Han-Wang], Cai, J.F.[Jian-Fei],
Auto-Parsing Network for Image Captioning and Visual Question Answering,
ICCV21(2177-2187)
IEEE DOI 2203
Training, Visualization, Graphical models, Stacking, Probability, Transformers, Vision + language BibRef

Banerjee, P.[Pratyay], Gokhale, T.[Tejas], Yang, Y.Z.[Ye-Zhou], Baral, C.[Chitta],
Weakly Supervised Relative Spatial Reasoning for Visual Question Answering,
ICCV21(1888-1898)
IEEE DOI 2203
Geometry, Visualization, Grounding, Semantics, Estimation, Predictive models, Vision + language, Visual reasoning and logical representation BibRef

Cao, Q.X.[Qing-Xing], Wan, W.T.[Wen-Tao], Wang, K.[Keze], Liang, X.D.[Xiao-Dan], Lin, L.[Liang],
Linguistically Routing Capsule Network for Out-of-distribution Visual Question Answering,
ICCV21(1594-1603)
IEEE DOI 2203
Visualization, Correlation, Fuses, Computational modeling, Merging, Training data, Vision + language BibRef

Li, L.J.[Lin-Jie], Lei, J.[Jie], Gan, Z.[Zhe], Liu, J.J.[Jing-Jing],
Adversarial VQA: A New Benchmark for Evaluating the Robustness of VQA Models,
ICCV21(2022-2031)
IEEE DOI 2203
Training, Visualization, Analytical models, Computational modeling, Benchmark testing, Robustness, Vision + language BibRef

Askarian, N.[Narjes], Abbasnejad, E.[Ehsan], Zukerman, I.[Ingrid], Buntine, W.[Wray], Haffari, G.[Gholamreza],
Inductive Biases for Low Data VQA: A Data Augmentation Approach,
Novelty22(231-240)
IEEE DOI 2202
Training, Visualization, Conferences, Natural languages, Image annotation, Data models BibRef

Mathew, M.[Minesh], Bagal, V.[Viraj], Tito, R.[Rubèn], Karatzas, D.[Dimosthenis], Valveny, E.[Ernest], Jawahar, C.V.,
InfographicVQA,
WACV22(2582-2591)
IEEE DOI 2202
Visualization, Computational modeling, Layout, Data visualization, Benchmark testing, Brain modeling, Vision and Languages BibRef

Kumar, S.[Sumit], Patro, B.N.[Badri N.], Namboodiri, V.P.[Vinay P.],
Auto QA: The Question Is Not Only What, but Also Where,
Novelty22(272-281)
IEEE DOI 2202
Location awareness, Visualization, Laser radar, Conferences, Semantics, Sensor systems BibRef

Kolling, C.[Camila], More, M.[Martin], Gavenski, N.[Nathan], Pooch, E.[Eduardo], Parraga, O.[Otávio], Barros, R.C.[Rodrigo C.],
Efficient Counterfactual Debiasing for Visual Question Answering,
WACV22(2572-2581)
IEEE DOI 2202
Training, Visualization, Frequency synthesizers, Correlation, Grounding, Computational modeling, Synthesizers, Analysis and Understanding BibRef

Jung, S.J.[Seung-Jun], Byun, J.[Junyoung], Shim, K.[Kyujin], Hwang, S.H.[Sang-Hyun], Kim, C.[Changick],
Understanding VQA for Negative Answers Through Visual and Linguistic Inference,
ICIP21(2873-2877)
IEEE DOI 2201
Visualization, Image processing, Linguistics, Knowledge discovery, Inference algorithms, Reliability, Image Captioning, Constrained Beam Search BibRef

Felix, R.[Rafael], Repasky, B.[Boris], Hodge, S.[Samuel], Zolfaghari, R.[Reza], Abbasnejad, E.[Ehsan], Sherrah, J.[Jamie],
Cross-Modal Visual Question Answering for Remote Sensing Data,
DICTA21(1-9)
IEEE DOI 2201
Earth, Visualization, Satellites, Digital images, Natural languages, Machine learning, Transformers, Visual Question Answering, OpenStreetMap BibRef

Le, T.[Tung], Nguyen, H.T.[Huy Tien], Nguyen, M.L.[Minh Le],
Vision and Text Transformer for Predicting Answerability on Visual Question Answering,
ICIP21(934-938)
IEEE DOI 2201
Visualization, Image processing, Predictive models, Knowledge discovery, Robustness, Task analysis, Answerability, Multi-head Attention BibRef

Huang, Z.Q.[Zi-Qi], Zhu, H.Y.[Hong-Yuan], Sun, Y.[Ying], Choi, D.[Dongkyu], Tan, C.[Cheston], Lim, J.H.[Joo-Hwee],
A Diagnostic Study of Visual Question Answering With Analogical Reasoning,
ICIP21(2463-2467)
IEEE DOI 2201
Location awareness, Visualization, Image processing, Natural languages, Benchmark testing, Tools, Knowledge discovery, benchmark BibRef

Chen, H.Y.[Hong-Yu], Liu, R.F.[Rui-Fang], Peng, B.[Bo],
Cross-modal Relational Reasoning Network for Visual Question Answering,
MAIR2-21(3939-3948)
IEEE DOI 2112
Bridges, Visualization, Semantics, Knowledge discovery, Linear programming BibRef

Wang, Z.X.[Zi-Xu], Miao, Y.[Yishu], Specia, L.[Lucia],
Latent Variable Models for Visual Question Answering,
CLVL21(3137-3141)
IEEE DOI 2112
Training, Visualization, Computer aided instruction, Benchmark testing, Knowledge discovery BibRef

Hirota, Y.[Yusuke], Garcia, N.[Noa], Otani, M.[Mayu], Chu, C.[Chenhui], Nakashima, Y.[Yuta], Taniguchi, I.[Ittetsu], Onoye, T.[Takao],
Visual Question Answering with Textual Representations for Images,
CLVL21(3147-3150)
IEEE DOI 2112
Visualization, Computational modeling, Knowledge discovery, Feature extraction, Object recognition BibRef

Ye, K.[Keren], Kovashka, A.[Adriana],
Linguistic Structures as Weak Supervision for Visual Scene Graph Generation,
CVPR21(8285-8295)
IEEE DOI 2111
Location awareness, Visualization, Blogs, Linguistics, Pattern recognition, Noise measurement BibRef

Yang, X.[Xu], Zhang, H.[Hanwang], Qi, G.J.[Guo-Jun], Cai, J.F.[Jian-Fei],
Causal Attention for Vision-Language Tasks,
CVPR21(9842-9852)
IEEE DOI 2111
Correlation, Codes, Computational modeling, Training data, Transformers, Data models BibRef

Xiao, J.B.[Jun-Bin], Shang, X.[Xindi], Yao, A.[Angela], Chua, T.S.[Tat-Seng],
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions,
CVPR21(9772-9781)
IEEE DOI 2111
Adaptation models, Benchmark testing, Knowledge discovery, Cognition, Pattern recognition, Task analysis BibRef

Chen, X.Y.[Xian-Yu], Jiang, M.[Ming], Zhao, Q.[Qi],
Predicting Human Scanpaths in Visual Question Answering,
CVPR21(10871-10880)
IEEE DOI 2111
Training, Visualization, Reinforcement learning, Predictive models, Tools, Knowledge discovery BibRef

Qi, Y.G.[Yong-Gang], Zhang, K.[Kai], Sain, A.[Aneeshan], Song, Y.Z.[Yi-Zhe],
PQA: Perceptual Question Answering,
CVPR21(12051-12059)
IEEE DOI 2111
Visualization, Training data, Psychology, Organizations, Visual systems, Knowledge discovery, Data models BibRef

Yuan, Y.Y.[Yuan-Yuan], Wang, S.[Shuai], Jiang, M.Y.[Ming-Yue], Chen, T.Y.[Tsong Yueh],
Perception Matters: Detecting Perception Failures of VQA Models Using Metamorphic Testing,
CVPR21(16903-16912)
IEEE DOI 2111
Visualization, Computational modeling, Transforms, Benchmark testing, Knowledge discovery, Cognition BibRef

Marino, K.[Kenneth], Chen, X.L.[Xin-Lei], Parikh, D.[Devi], Gupta, A.[Abhinav], Rohrbach, M.[Marcus],
KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA,
CVPR21(14106-14116)
IEEE DOI 2111
Training, Vocabulary, Knowledge based systems, Semantics, Training data, Knowledge representation, Predictive models BibRef

Niu, Y.[Yulei], Tang, K.[Kaihua], Zhang, H.[Hanwang], Lu, Z.W.[Zhi-Wu], Hua, X.S.[Xian-Sheng], Wen, J.R.[Ji-Rong],
Counterfactual VQA: A Cause-Effect Look at Language Bias,
CVPR21(12695-12705)
IEEE DOI 2111
Codes, Linguistics, Robustness, Cognition, Pattern recognition BibRef

Yang, Z.Y.[Zheng-Yuan], Lu, Y.J.[Yi-Juan], Wang, J.F.[Jian-Feng], Yin, X.[Xi], Florencio, D.[Dinei], Wang, L.J.[Li-Juan], Zhang, C.[Cha], Zhang, L.[Lei], Luo, J.B.[Jie-Bo],
TAP: Text-Aware Pre-training for Text-VQA and Text-Caption,
CVPR21(8747-8757)
IEEE DOI 2111
Visualization, Training data, Predictive models, Knowledge discovery, Pattern recognition, Optical character recognition software BibRef

Kervadec, C.[Corentin], Jaunet, T.[Théo], Antipov, G.[Grigory], Baccouche, M.[Moez], Vuillemot, R.[Romain], Wolf, C.[Christian],
How Transferable are Reasoning Patterns in VQA?,
CVPR21(4205-4214)
IEEE DOI 2111
Visualization, Analytical models, Data visualization, Tools, Transformers, Cognition, Data models BibRef

Kervadec, C.[Corentin], Antipov, G.[Grigory], Baccouche, M.[Moez], Wolf, C.[Christian],
Roses are Red, Violets are Blue… But Should VQA expect Them To?,
CVPR21(2775-2784)
IEEE DOI 2111
Training, Measurement, Visualization, Computational modeling, Benchmark testing, Knowledge discovery BibRef

Cho, J.W.[Jae Won], Kim, D.J.[Dong-Jin], Choi, J.[Jinsoo], Jung, Y.[Yunjae], Kweon, I.S.[In So],
Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task through Knowledge Distillation,
MULA21(1592-1601)
IEEE DOI 2109
Visualization, Knowledge discovery, Pattern recognition, Task analysis, Bars BibRef

Dua, R.[Radhika], Kancheti, S.S.[Sai Srinivas], Balasubramanian, V.N.[Vineeth N],
Beyond VQA: Generating Multi-word Answers and Rationales to Visual Questions,
MULA21(1623-1632)
IEEE DOI 2109
Deep learning, Visualization, Vocabulary, Computational modeling, Knowledge discovery BibRef

Rahman, T.[Tanzila], Chou, S.H.[Shih-Han], Sigal, L.[Leonid], Carenini, G.[Giuseppe],
An Improved Attention for Visual Question Answering,
MULA21(1653-1662)
IEEE DOI 2109
Visualization, Computational modeling, Natural languages, Logic gates BibRef

Stefanini, M.[Matteo], Cornia, M.[Marcella], Baraldi, L.[Lorenzo], Cucchiara, R.[Rita],
A Novel Attention-based Aggregation Function to Combine Vision and Language,
ICPR21(1212-1219)
IEEE DOI 2105
Deep learning, Visualization, Image retrieval, Transforms, Knowledge discovery BibRef

Jolly, S.[Shailza], Palacio, S.[Sebastian], Folz, J.[Joachim], Raue, F.[Federico], Hees, J.[Jörn], Dengel, A.[Andreas],
P ≈ NP, at least in Visual Question Answering,
ICPR21(2748-2754)
IEEE DOI 2105
Training, Visualization, Upper bound, Knowledge discovery, Pattern recognition BibRef

Liang, Y.Y.[Yao-Yuan], Wang, X.[Xin], Duan, X.G.[Xu-Guang], Zhu, W.W.[Wen-Wu],
Multi-modal Contextual Graph Neural Network for Text Visual Question Answering,
ICPR21(3491-3498)
IEEE DOI 2105
Visualization, Image recognition, Text recognition, Target recognition, Shape, Knowledge discovery, Graph neural networks BibRef

Farazi, M.[Moshiur], Khan, S.[Salman], Barnes, N.[Nick],
Question-Agnostic Attention for Visual Question Answering,
ICPR21(3542-3549)
IEEE DOI 2105
Training, Visualization, Image resolution, Preforms, Computational modeling, Semantics, Focusing, Multimodal Fusion BibRef

Li, Y.[Yanan], Lin, Y.[Yuetan], Zhao, H.H.[Hong-Hui], Wang, D.H.[Dong-Hui],
Dual Path Multi-Modal High-Order Features for Textual Content based Visual Question Answering,
ICPR21(4324-4331)
IEEE DOI 2105
Visualization, Image recognition, Image coding, Correlation, Text recognition, Fuses, Semantics BibRef

Mishra, A.[Aakansha], Anand, A.[Ashish], Guha, P.[Prithwijit],
Multi-stage Attention based Visual Question Answering,
ICPR21(9407-9414)
IEEE DOI 2105
Visualization, Analytical models, Bidirectional control, Benchmark testing, Knowledge discovery, Pattern recognition, Attention Network BibRef

Bozinis, T.[Theodoros], Passalis, N.[Nikolaos], Tefas, A.[Anastasios],
Improving Visual Question Answering using Active Perception on Static Images,
ICPR21(879-884)
IEEE DOI 2105
Deep learning, Visualization, Analytical models, Image resolution, Active perception, Reinforcement learning, Knowledge discovery BibRef

Huang, H.T.[Han-Tao], Han, T.[Tao], Han, W.[Wei], Yap, D.[Deep], Chiang, C.M.[Cheng-Ming],
Answer-checking in Context: A Multi-modal Fully Attention Network for Visual Question Answering,
ICPR21(1173-1180)
IEEE DOI 2105
Visualization, Bit error rate, Image representation, Knowledge discovery, Pattern recognition BibRef

Sun, Q.[Qiang], Xie, B.H.[Bing-Hui], Fu, Y.W.[Yan-Wei],
Second Order Enhanced Multi-Glimpse Attention in Visual Question Answering,
ACCV20(IV:87-103).
Springer DOI 2103
BibRef

Goel, V.[Vatsal], Chandak, M.[Mohit], Anand, A.[Ashish], Guha, P.[Prithwijit],
IQ-VQA: Intelligent Visual Question Answering,
VTIUR20(357-370).
Springer DOI 2103
BibRef

Tan, S.[Sinan], Xiang, W.[Weilai], Liu, H.P.[Hua-Ping], Guo, D.[Di], Sun, F.C.[Fu-Chun],
Multi-agent Embodied Question Answering in Interactive Environments,
ECCV20(XIII:663-678).
Springer DOI 2011
BibRef

Qiao, Y., Yu, Z., Liu, J.,
VC-VQA: Visual Calibration Mechanism For Visual Question Answering,
ICIP20(1481-1485)
IEEE DOI 2011
Visualization, Image reconstruction, Calibration, Task analysis, Predictive models, Feature extraction, Knowledge discovery, Feature Reconstruction BibRef

Jain, V., Lodhavia, J.,
Automatic Question Tagging using k-Nearest Neighbors and Random Forest,
ISCV20(1-4)
IEEE DOI 2011
learning (artificial intelligence), question answering (information retrieval), Natural Language Processing BibRef

Tang, R.X.[Rui-Xue], Ma, C.[Chao], Zhang, W.E.[Wei Emma], Wu, Q.[Qi], Yang, X.K.[Xiao-Kang],
Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering,
ECCV20(XIX:437-453).
Springer DOI 2011
BibRef

Gokhale, T.[Tejas], Banerjee, P.[Pratyay], Baral, C.[Chitta], Yang, Y.Z.[Ye-Zhou],
VQA-LOL: Visual Question Answering Under the Lens of Logic,
ECCV20(XXI:379-396).
Springer DOI 2011
BibRef

Yang, X.F.[Xiao-Feng], Lin, G.S.[Guo-Sheng], Lv, F.M.[Feng-Mao], Liu, F.Y.[Fa-Yao],
TRRNET: Tiered Relation Reasoning for Compositional Visual Question Answering,
ECCV20(XXI:414-430).
Springer DOI 2011
BibRef

Bansal, A.[Ankan], Zhang, Y.[Yuting], Chellappa, R.[Rama],
Visual Question Answering on Image Sets,
ECCV20(XXI:51-67).
Springer DOI 2011
BibRef

Han, X.Z.[Xin-Zhe], Wang, S.H.[Shu-Hui], Su, C.[Chi], Zhang, W.G.[Wei-Gang], Huang, Q.M.[Qing-Ming], Tian, Q.[Qi],
Interpretable Visual Reasoning via Probabilistic Formulation Under Natural Supervision,
ECCV20(IX:553-570).
Springer DOI 2011
BibRef

Kant, Y.[Yash], Batra, D.[Dhruv], Anderson, P.[Peter], Schwing, A.[Alexander], Parikh, D.[Devi], Lu, J.[Jiasen], Agrawal, H.[Harsh],
Spatially Aware Multimodal Transformers for TextVQA,
ECCV20(IX:715-732).
Springer DOI 2011
BibRef

Li, Q.[Qing], Huang, S.Y.[Si-Yuan], Hong, Y.[Yining], Zhu, S.C.[Song-Chun],
A Competence-aware Curriculum for Visual Concepts Learning via Question Answering,
ECCV20(II:141-157).
Springer DOI 2011
BibRef

Zheng, W.B.[Wen-Bo], Yan, L.[Lan], Gou, C.[Chao], Wang, F.Y.[Fei-Yue],
Webly Supervised Knowledge Embedding Model for Visual Reasoning,
CVPR20(12442-12451)
IEEE DOI 2008
Visual reasoning between visual image and natural language description. Visualization, Cognition, Knowledge based systems, Task analysis, Knowledge engineering, Modulation, Robustness BibRef

Wang, P.[Peng], Wu, Q.[Qi], Cao, J.W.[Jie-Wei], Shen, C.H.[Chun-Hua], Gao, L.L.[Lian-Li], van den Hengel, A.J.[Anton J.],
Neighbourhood Watch: Referring Expression Comprehension via Language-Guided Graph Attention Networks,
CVPR19(1960-1968).
IEEE DOI 2002
BibRef

Bajaj, G., Bandyopadhyay, B., Schmidt, D., Maneriker, P., Myers, C., Parthasarathy, S.,
Understanding Knowledge Gaps in Visual Question Answering: Implications for Gap Identification and Testing,
MVM20(1563-1566)
IEEE DOI 2008
Cognition, Training, Task analysis, Artificial intelligence, Global communication, Taxonomy, Semantics BibRef

Chen, L., Yan, X., Xiao, J., Zhang, H., Pu, S., Zhuang, Y.,
Counterfactual Samples Synthesizing for Robust Visual Question Answering,
CVPR20(10797-10806)
IEEE DOI 2008
Training, Cascading style sheets, Predictive models, Visualization, Image color analysis, Linguistics, Computational modeling BibRef

Vatashsky, B., Ullman, S.,
VQA With No Questions-Answers Training,
CVPR20(10373-10383)
IEEE DOI 2008
Visualization, Training, Image color analysis, Knowledge discovery, Boats, Image analysis, Task analysis BibRef

Jiang, H., Misra, I., Rohrbach, M., Learned-Miller, E.G., Chen, X.,
In Defense of Grid Features for Visual Question Answering,
CVPR20(10264-10273)
IEEE DOI 2008
Feature extraction, Visualization, Task analysis, Detectors, Object detection, Training, Pipelines BibRef

Wang, X., Liu, Y., Shen, C., Ng, C.C., Luo, C., Jin, L., Chan, C.S., van den Hengel, A., Wang, L.,
On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering,
CVPR20(10123-10132)
IEEE DOI 2008
Measurement, Cognition, Knowledge discovery, Correlation, Task analysis, Visualization, Optical character recognition software BibRef

Xiong, P., Wu, Y.,
TA-Student VQA: Multi-Agents Training by Self-Questioning,
CVPR20(10062-10072)
IEEE DOI 2008
Visualization, Training, Knowledge discovery, Standards, Task analysis, Boosting BibRef

Agarwal, V., Shetty, R., Fritz, M.,
Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing,
CVPR20(9687-9695)
IEEE DOI 2008
Data models, Robustness, Predictive models, Semantics, Correlation, Vocabulary, Visualization BibRef

Hu, R., Singh, A., Darrell, T.J., Rohrbach, M.,
Iterative Answer Prediction With Pointer-Augmented Multimodal Transformers for TextVQA,
CVPR20(9989-9999)
IEEE DOI 2008
Optical character recognition software, Task analysis, Feature extraction, Visualization, Iterative decoding, Vocabulary, Predictive models BibRef

Kafle, K., Shrestha, R., Price, B., Cohen, S., Kanan, C.,
Answering Questions about Data Visualizations using Efficient Bimodal Fusion,
WACV20(1487-1496)
IEEE DOI 2006
Bars, Data visualization, Image color analysis, Visualization, Task analysis, Optical character recognition software, Training BibRef

Patro, B.N., Kurmi, V.K., Kumar, S., Namboodiri, V.P.,
Deep Bayesian Network for Visual Question Generation,
WACV20(1555-1565)
IEEE DOI 2006
Bayes methods, Task analysis, Visualization, Uncertainty, Decoding, Probabilistic logic, Semantics BibRef

Patro, B.N., Patel, S., Namboodiri, V.P.,
Robust Explanations for Visual Question Answering,
WACV20(1566-1575)
IEEE DOI 2006
Visualization, Robustness, Perturbation methods, Knowledge discovery, Collaboration, Task analysis, Coherence BibRef

Chou, S., Chao, W., Lai, W., Sun, M., Yang, M.,
Visual Question Answering on 360° Images,
WACV20(1596-1605)
IEEE DOI 2006
Visualization, Task analysis, Feature extraction, Distortion, Cognition, Image color analysis, Spatial resolution BibRef

Chaudhry, R., Shekhar, S., Gupta, U., Maneriker, P., Bansal, P., Joshi, A.,
LEAF-QA: Locate, Encode & Attend for Figure Question Answering,
WACV20(3501-3510)
IEEE DOI 2006
Bars, Knowledge discovery, Image color analysis, Training, Vocabulary, Data mining, Data visualization BibRef

Liang, Y.Z.[Yuan-Zhi], Bai, Y.L.[Ya-Long], Zhang, W.[Wei], Qian, X.M.[Xue-Ming], Zhu, L.[Li], Mei, T.[Tao],
VrR-VG: Refocusing Visually-Relevant Relationships,
ICCV19(10402-10411)
IEEE DOI 2004
bioinformatics, data mining, data visualisation, feature extraction, genomics, graph theory, image annotation, Cognition BibRef

Singh, A.K., Mishra, A., Shekhar, S., Chakraborty, A.,
From Strings to Things: Knowledge-Enabled VQA Model That Can Read and Reason,
ICCV19(4601-4611)
IEEE DOI 2004
document image processing, graph theory, inference mechanisms, neural nets, text analysis, visual content proposals, Proposals BibRef

Bhattacharya, N., Li, Q., Gurari, D.,
Why Does a Visual Question Have Different Answers?,
ICCV19(4270-4279)
IEEE DOI 2004
Code, Visual Q-A.
WWW Link. question answering (information retrieval), visual question answering, Visualization, Powders, Task analysis, Computer vision BibRef

Li, L., Gan, Z., Cheng, Y., Liu, J.,
Relation-Aware Graph Attention Network for Visual Question Answering,
ICCV19(10312-10321)
IEEE DOI 2004
data visualisation, graph theory, learning (artificial intelligence), object detection, Computational modeling BibRef

Gao, P.[Peng], You, H.X.[Hao-Xuan], Zhang, Z.P.[Zhan-Peng], Wang, X.G.[Xiao-Gang], Li, H.S.[Hong-Sheng],
Multi-Modality Latent Interaction Network for Visual Question Answering,
ICCV19(5824-5834)
IEEE DOI 2004
data visualisation, image representation, image retrieval, learning (artificial intelligence), Object detection BibRef

Do, T., Tran, H., Do, T., Tjiputra, E., Tran, Q.,
Compact Trilinear Interaction for Visual Question Answering,
ICCV19(392-401)
IEEE DOI 2004
learning (artificial intelligence), matrix decomposition, Correlation BibRef

Nguyen, D.K.[Duy-Kien], Okatani, T.[Takayuki],
Multi-Task Learning of Hierarchical Vision-Language Representation,
CVPR19(10484-10493).
IEEE DOI 2002
BibRef

Schwartz, I.[Idan], Yu, S.[Seunghak], Hazan, T.[Tamir], Schwing, A.G.[Alexander G.],
Factor Graph Attention,
CVPR19(2039-2048).
IEEE DOI 2002
BibRef

Kolesnikov, A.[Alexander], Beyer, L.[Lucas], Zhai, X.H.[Xiao-Hua], Puigcerver, J.[Joan], Yung, J.[Jessica], Gelly, S.[Sylvain], Houlsby, N.[Neil],
Big Transfer (BiT): General Visual Representation Learning,
ECCV20(V:491-507).
Springer DOI 2011
BibRef

Kolesnikov, A.[Alexander], Zhai, X.H.[Xiao-Hua], Beyer, L.[Lucas],
Revisiting Self-Supervised Visual Representation Learning,
CVPR19(1920-1929).
IEEE DOI 2002
BibRef

Xiong, P.X.[Pei-Xi], Zhan, H.Y.[Hua-Yi], Wang, X.[Xin], Sinha, B.[Baivab], Wu, Y.[Ying],
Visual Query Answering by Entity-Attribute Graph Matching and Reasoning,
CVPR19(8349-8358).
IEEE DOI 2002
BibRef

Singh, A.[Amanpreet], Natarajan, V.[Vivek], Shah, M.[Meet], Jiang, Y.[Yu], Chen, X.L.[Xin-Lei], Batra, D.[Dhruv], Parikh, D.[Devi], Rohrbach, M.[Marcus],
Towards VQA Models That Can Read,
CVPR19(8309-8318).
IEEE DOI 2002
BibRef

Manjunatha, V.[Varun], Saini, N.[Nirat], Davis, L.S.[Larry S.],
Explicit Bias Discovery in Visual Question Answering Models,
CVPR19(9554-9563).
IEEE DOI 2002
BibRef

Shrestha, R.[Robik], Kafle, K.[Kushal], Kanan, C.[Christopher],
Answer Them All! Toward Universal Visual Question Answering Models,
CVPR19(10464-10473).
IEEE DOI 2002
BibRef

Zadeh, A.[Amir], Chan, M.[Michael], Liang, P.P.[Paul Pu], Tong, E.[Edmund], Morency, L.P.[Louis-Philippe],
Social-IQ: A Question Answering Benchmark for Artificial Social Intelligence,
CVPR19(8799-8809).
IEEE DOI 2002
BibRef

Noh, H.[Hyeonwoo], Kim, T.[Taehoon], Mun, J.[Jonghwan], Han, B.H.[Bo-Hyung],
Transfer Learning via Unsupervised Task Discovery for Visual Question Answering,
CVPR19(8377-8386).
IEEE DOI 2002
BibRef

Wijmans, E.[Erik], Datta, S.[Samyak], Maksymets, O.[Oleksandr], Das, A.[Abhishek], Gkioxari, G.[Georgia], Lee, S.[Stefan], Essa, I.[Irfan], Parikh, D.[Devi], Batra, D.[Dhruv],
Embodied Question Answering in Photorealistic Environments With Point Cloud Perception,
CVPR19(6652-6661).
IEEE DOI 2002
BibRef

Shah, M.[Meet], Chen, X.L.[Xin-Lei], Rohrbach, M.[Marcus], Parikh, D.[Devi],
Cycle-Consistency for Robust Visual Question Answering,
CVPR19(6642-6651).
IEEE DOI 2002
BibRef

Li, H.[Hui], Wang, P.[Peng], Shen, C.H.[Chun-Hua], van den Hengel, A.[Anton],
Visual Question Answering as Reading Comprehension,
CVPR19(6312-6321).
IEEE DOI 2002
BibRef

Yu, L.C.[Li-Cheng], Chen, X.L.[Xin-Lei], Gkioxari, G.[Georgia], Bansal, M.[Mohit], Berg, T.L.[Tamara L.], Batra, D.[Dhruv],
Multi-Target Embodied Question Answering,
CVPR19(6302-6311).
IEEE DOI 2002
BibRef

Yu, Z.[Zhou], Yu, J.[Jun], Cui, Y.[Yuhao], Tao, D.C.[Da-Cheng], Tian, Q.[Qi],
Deep Modular Co-Attention Networks for Visual Question Answering,
CVPR19(6274-6283).
IEEE DOI 2002
BibRef

Abbasnejad, E.[Ehsan], Wu, Q.[Qi], Shi, Q.F.[Qin-Feng], van den Hengel, A.[Anton],
What's to Know? Uncertainty as a Guide to Asking Goal-Oriented Questions,
CVPR19(4150-4159).
IEEE DOI 2002
BibRef

Schwenk, D.[Dustin], Khandelwal, A.[Apoorv], Clark, C.[Christopher], Marino, K.[Kenneth], Mottaghi, R.[Roozbeh],
A-OKVQA: A Benchmark for Visual Question Answering Using World Knowledge,
ECCV22(VIII:146-162).
Springer DOI 2211
BibRef

Marino, K.[Kenneth], Rastegari, M.[Mohammad], Farhadi, A.[Ali], Mottaghi, R.[Roozbeh],
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge,
CVPR19(3190-3199).
IEEE DOI 2002
BibRef

Krishna, R.[Ranjay], Bernstein, M.[Michael], Fei-Fei, L.[Li],
Information Maximizing Visual Question Generation,
CVPR19(2008-2018).
IEEE DOI 2002
BibRef

Cadene, R.[Remi], Ben-younes, H.[Hedi], Cord, M.[Matthieu], Thome, N.[Nicolas],
MUREL: Multimodal Relational Reasoning for Visual Question Answering,
CVPR19(1989-1998).
IEEE DOI 2002
BibRef

Haurilet, M.[Monica], Roitberg, A.[Alina], Stiefelhagen, R.[Rainer],
It's Not About the Journey; It's About the Destination: Following Soft Paths Under Question-Guidance for Visual Reasoning,
CVPR19(1930-1939).
IEEE DOI 2002
BibRef

Qiu, Y., Satoh, Y., Suzuki, R., Kataoka, H.,
Incorporating 3D Information Into Visual Question Answering,
3DV19(756-765)
IEEE DOI 1911
Feature extraction, Task analysis, Visualization, Natural language processing, Cognition, Human computer interaction BibRef

Haurilet, M.[Monica], Al-Halah, Z.[Ziad], Stiefelhagen, R.[Rainer],
DynGraph: Visual Question Answering via Dynamic Scene Graphs,
GCPR19(428-441).
Springer DOI 1911
BibRef
Earlier:
MoQA: A Multi-modal Question Answering Architecture,
VL18(IV:106-113).
Springer DOI 1905
BibRef

Liu, F., Liu, J., Fang, Z., Lu, H.,
Language and Visual Relations Encoding for Visual Question Answering,
ICIP19(3307-3311)
IEEE DOI 1910
Visual question answering, Relations, Attention BibRef

Fang, Z.W.[Zhi-Wei], Liu, J.[Jing], Tang, Q.[Qu], Li, Y.[Yong], Lu, H.Q.[Han-Qing],
Answer Distillation for Visual Question Answering,
ACCV18(I:72-87).
Springer DOI 1906
BibRef

Kuhnle, A.[Alexander], Xie, H.Y.[Hui-Yuan], Copestake, A.[Ann],
How Clever Is the FiLM Model, and How Clever Can it Be?,
VL18(IV:162-172).
Springer DOI 1905
BibRef

Li, W.[Wei], Yuan, Z.H.[Ze-Huan], Fang, X.Z.[Xiang-Zhong], Wang, C.[Changhu],
Knowing Where to Look? Analysis on Attention of Visual Question Answering System,
VL18(IV:145-152).
Springer DOI 1905
BibRef

Wagner, M.[Misha], Basevi, H.[Hector], Shetty, R.[Rakshith], Li, W.B.[Wen-Bin], Malinowski, M.[Mateusz], Fritz, M.[Mario], Leonardis, A.[Aleš],
Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions,
VLEASE18(I:521-537).
Springer DOI 1905
BibRef

Duke, B., Taylor, G.W.,
Generalized Hadamard-Product Fusion Operators for Visual Question Answering,
CRV18(39-46)
IEEE DOI 1812
Feature extraction, Visualization, Task analysis, Data models, Mathematical model, Natural languages, Model Selection, Visual Question-Answering BibRef

Das, A., Datta, S., Gkioxari, G., Lee, S., Parikh, D., Batra, D.,
Embodied Question Answering,
CVPR18(1-10)
IEEE DOI 1812
Navigation, Visualization, Task analysis, Automobiles, Knowledge discovery BibRef

Misra, I., Girshick, R., Fergus, R., Hebert, M., Gupta, A., van der Maaten, L.[Laurens],
Learning by Asking Questions,
CVPR18(11-20)
IEEE DOI 1812
Training, Proposals, Visualization, Knowledge discovery, Standards, Task analysis, Data models BibRef

Gurari, D., Li, Q., Stangl, A.J., Guo, A., Lin, C., Grauman, K., Luo, J., Bigham, J.P.,
VizWiz Grand Challenge: Answering Visual Questions from Blind People,
CVPR18(3608-3617)
IEEE DOI 1812
Visualization, Blindness, Prediction algorithms, Lighting, Mobile handsets, Shape BibRef

Li, J., Su, H., Zhu, J., Wang, S., Zhang, B.,
Textbook Question Answering Under Instructor Guidance with Memory Networks,
CVPR18(3655-3663)
IEEE DOI 1812
Task analysis, Cognition, Visualization, Feature extraction, Semantics, Knowledge discovery, Drugs BibRef

Gordon, D., Kembhavi, A., Rastegari, M., Redmon, J., Fox, D., Farhadi, A.,
IQA: Visual Question Answering in Interactive Environments,
CVPR18(4089-4098)
IEEE DOI 1812
Task analysis, Navigation, Visualization, Knowledge discovery, Semantics, Planning BibRef

Agrawal, A., Batra, D., Parikh, D., Kembhavi, A.,
Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering,
CVPR18(4971-4980)
IEEE DOI 1812
Image color analysis, Visualization, Data models, Training data, Training, Knowledge discovery, Dogs BibRef

Sha, F., Chao, W., Hu, H.,
Learning Answer Embeddings for Visual Question Answering,
CVPR18(5428-5436)
IEEE DOI 1812
Visualization, Semantics, Probabilistic logic, Computational modeling, Task analysis, Training, Adaptation models BibRef

Kafle, K., Price, B., Cohen, S., Kanan, C.,
DVQA: Understanding Data Visualizations via Question Answering,
CVPR18(5648-5656)
IEEE DOI 1812
Bars, Cognition, Image color analysis, Visualization, Data visualization, Data mining, Knowledge discovery BibRef

Sha, F., Hu, H., Chao, W.,
Cross-Dataset Adaptation for Visual Question Answering,
CVPR18(5716-5725)
IEEE DOI 1812
Visualization, Task analysis, Adaptation models, Knowledge discovery, Games, Training, Target recognition BibRef

Anderson, P., He, X., Buehler, C., Teney, D., Johnson, M., Gould, S., Zhang, L.,
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering,
CVPR18(6077-6086)
IEEE DOI 1812
Visualization, Task analysis, Proposals, Mathematical model, Servers, Context modeling, Object detection BibRef

Nguyen, D., Okatani, T.,
Improved Fusion of Visual and Language Representations by Dense Symmetric Co-attention for Visual Question Answering,
CVPR18(6087-6096)
IEEE DOI 1812
Feature extraction, Visualization, Fuses, Knowledge discovery, Bidirectional control BibRef

Ma, C., Shen, C., Dick, A., Wu, Q., Wang, P., van den Hengel, A.J.[Anton J.], Reid, I.D.,
Visual Question Answering with Memory-Augmented Networks,
CVPR18(6975-6984)
IEEE DOI 1812
Visualization, Neural networks, Training, Knowledge discovery, Feature extraction, Bidirectional control, Prediction algorithms BibRef

Patro, B., Namboodiri, V.P.,
Differential Attention for Visual Question Answering,
CVPR18(7680-7688)
IEEE DOI 1812
Semantics, Task analysis, Visualization, Knowledge discovery, Correlation, Measurement, Training BibRef

Su, Z.[Zhou], Zhu, C.[Chen], Dong, Y.P.[Yin-Peng], Cai, D.Q.[Dong-Qi], Chen, Y.R.[Yu-Rong], Li, J.G.[Jian-Guo],
Learning Visual Knowledge Memory Networks for Visual Question Answering,
CVPR18(7736-7745)
IEEE DOI 1812
Visualization, Knowledge based systems, Task analysis, Knowledge discovery, Cognition, Ovens BibRef

Shin, A., Ushiku, Y., Harada, T.,
Customized Image Narrative Generation via Interactive Visual Question Generation and Answering,
CVPR18(8925-8933)
IEEE DOI 1812
Visualization, Task analysis, Feature extraction, Proposals, Knowledge discovery, Recurrent neural networks, Training BibRef

Das, A., Datta, S., Gkioxari, G., Lee, S., Parikh, D., Batra, D.,
Embodied Question Answering,
DeepLearnRV18(2135-213509)
IEEE DOI 1812
Navigation, Visualization, Task analysis, Automobiles, Knowledge discovery BibRef

Cheng, W., Huang, Y., Wang, L.,
Towards Unconstrained Pointing Problem of Visual Question Answering: A Retrieval-based Method,
ICPR18(3303-3308)
IEEE DOI 1812
Visualization, Task analysis, Feature extraction, Training, Knowledge discovery, Proposals, Semantics BibRef

Teney, D., Anderson, P., He, X., van den Hengel, A.J.[Anton J.],
Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge,
CVPR18(4223-4232)
IEEE DOI 1812
Training, Visualization, Task analysis, Neural networks, Knowledge discovery, Logic gates, Computer architecture BibRef

Zhou, B.[Bolei], Sun, Y.[Yiyou], Bau, D.[David], Torralba, A.B.[Antonio B.],
Interpretable Basis Decomposition for Visual Explanation,
ECCV18(VIII: 122-138).
Springer DOI 1810
BibRef

Shi, Y.[Yang], Furlanello, T.[Tommaso], Zha, S.[Sheng], Anandkumar, A.[Animashree],
Question Type Guided Attention in Visual Question Answering,
ECCV18(II: 158-175).
Springer DOI 1810
BibRef

Narasimhan, M.[Medhini], Schwing, A.G.[Alexander G.],
Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering,
ECCV18(VIII: 460-477).
Springer DOI 1810
BibRef

Malinowski, M.[Mateusz], Doersch, C.[Carl], Santoro, A.[Adam], Battaglia, P.[Peter],
Learning Visual Question Answering by Bootstrapping Hard Attention,
ECCV18(VI: 3-20).
Springer DOI 1810
BibRef

Gu, J.X.[Jiu-Xiang], Cai, J.F.[Jian-Fei], Joty, S.[Shafiq], Niu, L.[Li], Wang, G.[Gang],
Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models,
CVPR18(7181-7189)
IEEE DOI 1812
Visualization, Training, Decoding, Semantics, Measurement BibRef

Li, Q.[Qing], Tao, Q.Y.[Qing-Yi], Joty, S.[Shafiq], Cai, J.F.[Jian-Fei], Luo, J.B.[Jie-Bo],
VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions,
ECCV18(VII: 570-586).
Springer DOI 1810
BibRef

Bai, Y.L.[Ya-Long], Fu, J.L.[Jian-Long], Zhao, T.J.[Tie-Jun], Mei, T.[Tao],
Deep Attention Neural Tensor Network for Visual Question Answering,
ECCV18(XII: 21-37).
Springer DOI 1810
BibRef

Sinha, A.[Abhishek], Ayush, K.[Kumar],
Towards Mathematical Reasoning: A Multimodal Deep Learning Approach,
ICIP18(4028-4032)
IEEE DOI 1809
Mathematical model, Task analysis, Visualization, Decoding, Computational modeling, Machine learning, Numerical models, Mathematical Reasoning BibRef

Yu, D., Gao, X., Xiong, H.,
Structured Semantic Representation for Visual Question Answering,
ICIP18(2286-2290)
IEEE DOI 1809
Semantics, Training, Cognition, Visualization, Task analysis, Linguistics, Computational modeling, Visual question answering BibRef

Huang, L., Kulkarni, K., Jha, A., Lohit, S., Jayasuriya, S., Turaga, P.K.,
CS-VQA: Visual Question Answering with Compressively Sensed Images,
ICIP18(1283-1287)
IEEE DOI 1809
Visualization, Image reconstruction, Image coding, Task analysis, Feature extraction, Training, Multiplexing, image reconstruction BibRef

Desta, M.T., Chen, L., Kornuta, T.,
Object-Based Reasoning in VQA,
WACV18(1814-1823)
IEEE DOI 1806
data visualisation, inference mechanisms, natural language processing, object detection, Visualization BibRef

Zhao, H., Fan, Q., Gutfreund, D., Fu, Y.,
Semantically Guided Visual Question Answering,
WACV18(1852-1860)
IEEE DOI 1806
data visualisation, image colour analysis, image representation, learning (artificial intelligence), Visualization BibRef

Wang, Z., Liu, X., Wang, L., Qiao, Y., Xie, X., Fowlkes, C.C.[Charless C.],
Structured Triplet Learning with POS-Tag Guided Attention for Visual Question Answering,
WACV18(1888-1896)
IEEE DOI 1806
convolution, data visualisation, learning (artificial intelligence), Visualization BibRef

Chowdhury, I., Nguyen, K., Fookes, C., Sridharan, S.,
A cascaded long short-term memory (LSTM) driven generic visual question answering (VQA),
ICIP17(1842-1846)
IEEE DOI 1803
Feature extraction, Mathematical model, Natural languages, Principal component analysis, Task analysis, Training, scene understanding BibRef

Sheng, S.[Shurong], Venkitasubramanian, A.N.[Aparna Nurani], Moens, M.F.[Marie-Francine],
A Markov Network Based Passage Retrieval Method for Multimodal Question Answering in the Cultural Heritage Domain,
MMMod18(I:3-15).
Springer DOI 1802
BibRef

Rosso-Mateus, A.[Andrés], González, F.A.[Fabio A.], Montes-y-Gómez, M.[Manuel],
A Two-Step Neural Network Approach to Passage Retrieval for Open Domain Question Answering,
CIARP17(566-574).
Springer DOI 1802
BibRef

Gupta, T.[Tanmay], Shih, K.J.[Kevin J.], Singh, S.[Saurabh], Hoiem, D.[Derek],
Aligned Image-Word Representations Improve Inductive Transfer Across Vision-Language Tasks,
ICCV17(4223-4232)
IEEE DOI 1802
data visualisation, image recognition, learning (artificial intelligence), Visualization BibRef

Yu, Z., Yu, J., Fan, J., Tao, D.,
Multi-modal Factorized Bilinear Pooling with Co-attention Learning for Visual Question Answering,
ICCV17(1839-1848)
IEEE DOI 1802
computational complexity, feature extraction, image fusion, learning (artificial intelligence), Visualization BibRef

Ben-younes, H., Cadene, R., Cord, M., Thome, N.,
MUTAN: Multimodal Tucker Fusion for Visual Question Answering,
ICCV17(2631-2639)
IEEE DOI 1802
image fusion, image representation, question answering (information retrieval), tensors, (VQA) tasks, Visualization BibRef

Zhu, C., Zhao, Y., Huang, S., Tu, K., Ma, Y.,
Structured Attentions for Visual Question Answering,
ICCV17(1300-1309)
IEEE DOI 1802
belief networks, data visualisation, image retrieval, inference mechanisms, neural nets, Visualization BibRef

Hu, R., Andreas, J., Rohrbach, M., Darrell, T.J., Saenko, K.,
Learning to Reason: End-to-End Module Networks for Visual Question Answering,
ICCV17(804-813)
IEEE DOI 1802
computational linguistics, grammars, natural language processing, neural net architecture, Visualization BibRef

Jain, U.[Unnat], Zhang, Z.Y.[Zi-Yu], Schwing, A.[Alexander],
Creativity: Generating Diverse Questions Using Variational Autoencoders,
CVPR17(5415-5424)
IEEE DOI 1711
Artificial intelligence, Creativity, Hidden Markov models, Training, Transforms, Visualization BibRef

Zhu, Y., Lim, J.J., Fei-Fei, L.[Li],
Knowledge Acquisition for Visual Question Answering via Iterative Querying,
CVPR17(6146-6155)
IEEE DOI 1711
Computational modeling, Data models, Generators, Knowledge discovery, Standards, Visualization BibRef

Peris, Á.[Álvaro], Casacuberta, F.[Francisco],
Interactive-Predictive Neural Multimodal Systems,
IbPRIA19(I:16-28).
Springer DOI 1910
BibRef

Bolaños, M.[Marc], Peris, Á.[Álvaro], Casacuberta, F.[Francisco], Radeva, P.[Petia],
VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering,
IbPRIA17(372-380).
Springer DOI 1706
BibRef

Gao, P.[Peng], Li, H.S.[Hong-Sheng], Li, S.[Shuang], Lu, P.[Pan], Li, Y.K.[Yi-Kang], Hoi, S.C.H.[Steven C. H.], Wang, X.G.[Xiao-Gang],
Question-Guided Hybrid Convolution for Visual Question Answering,
ECCV18(I: 485-501).
Springer DOI 1810
BibRef

Uehara, K.[Kohei], Duan, N.[Nan], Harada, T.[Tatsuya],
Learning to Ask Informative Sub-Questions for Visual Question Answering,
MULA22(4680-4689)
IEEE DOI 2210
Training, Visualization, Computational modeling, Reinforcement learning, Predictive models BibRef

Li, Y.K.[Yi-Kang], Duan, N.[Nan], Zhou, B.L.[Bo-Lei], Chu, X.[Xiao], Ouyang, W.L.[Wan-Li], Wang, X.G.[Xiao-Gang], Zhou, M.[Ming],
Visual Question Generation as Dual Task of Visual Question Answering,
CVPR18(6116-6124)
IEEE DOI 1812
Task analysis, Visualization, Knowledge discovery, Training, Computational modeling BibRef

Gao, P.[Peng], Jiang, Z.K.[Zheng-Kai], You, H.X.[Hao-Xuan], Lu, P.[Pan], Hoi, S.C.H.[Steven C. H.], Wang, X.G.[Xiao-Gang], Li, H.S.[Hong-Sheng],
Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering,
CVPR19(6632-6641).
IEEE DOI 2002
BibRef

Lin, Y.T.[Yue-Tan], Pang, Z.Y.[Zhang-Yang], Li, Y.[Yanan], Wang, D.H.[Dong-Hui],
Simple and effective visual question answering in a single modality,
ICIP16(2276-2280)
IEEE DOI 1610
Benchmark testing. Not just add text to image questions. BibRef

Kafle, K.[Kushal], Kanan, C.[Christopher],
An Analysis of Visual Question Answering Algorithms,
ICCV17(1983-1991)
IEEE DOI 1802
BibRef
Earlier:
Answer-Type Prediction for Visual Question Answering,
CVPR16(4976-4984)
IEEE DOI 1612
case-based reasoning, data visualisation, image retrieval, neural nets, Visualization BibRef

Wang, P., Wu, Q., Shen, C., van den Hengel, A.J.[Anton J.],
The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions,
CVPR17(3909-3918)
IEEE DOI 1711
Cognition, Data mining, Neural networks, Prediction algorithms, Telescopes, Visualization BibRef

Yu, D., Fu, J., Mei, T., Rui, Y.,
Multi-level Attention Networks for Visual Question Answering,
CVPR17(4187-4195)
IEEE DOI 1711
Feature extraction, Knowledge discovery, Natural languages, Recurrent neural networks, Semantics, Visualization BibRef

Kembhavi, A., Seo, M., Schwenk, D., Choi, J., Farhadi, A., Hajishirzi, H.,
Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension,
CVPR17(5376-5384)
IEEE DOI 1711
Cognition, Knowledge discovery, Natural languages, Training, Visualization BibRef

Ganju, S., Russakovsky, O., Gupta, A.,
What's in a Question: Using Visual Questions as a Form of Supervision,
CVPR17(6422-6431)
IEEE DOI 1711
Artificial intelligence, Computational modeling, Dogs, Image color analysis, SPICE, Visualization BibRef

Ramakrishnan, S.K., Pal, A., Sharma, G., Mittal, A.,
An Empirical Evaluation of Visual Question Answering for Novel Objects,
CVPR17(7312-7321)
IEEE DOI 1711
Knowledge discovery, Recurrent neural networks, Training, Training data, Visualization, Vocabulary BibRef

Xu, H.J.[Hui-Juan], Saenko, K.[Kate],
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering,
ECCV16(VII: 451-466).
Springer DOI 1611
Visual Question Answering. BibRef

Jabri, A.[Allan], Joulin, A.[Armand], van der Maaten, L.[Laurens],
Revisiting Visual Question Answering Baselines,
ECCV16(VIII: 727-739).
Springer DOI 1611
BibRef

Yang, Z.C.[Zi-Chao], He, X.D.[Xiao-Dong], Gao, J.F.[Jian-Feng], Deng, L.[Li], Smola, A.[Alex],
Stacked Attention Networks for Image Question Answering,
CVPR16(21-29)
IEEE DOI 1612
BibRef

Sadeghi, F.[Fereshteh], Divvala, S.K.[Santosh K.], Farhadi, A.[Ali],
VisKE: Visual knowledge extraction and question answering by visual verification of relation phrases,
CVPR15(1456-1464)
IEEE DOI 1510
Visual verification of text relationships. BibRef

Liu, Y.[Yang], Liu, J.[Jie], Wang, D.[Dong], Cheng, J.[Jian],
A robust multivariate reranking algorithm for Question Answering enrichment,
ICIP12(1917-1920).
IEEE DOI 1302
BibRef

Varekamp, C.[Chris], van de Walle, P.[Patrick], de Putter, M.[Marc],
Question interface for 3D picture creation on an autostereoscopic digital picture frame,
3DTV09(1-4).
IEEE DOI 0905
BibRef

Chapter on Implementations and Applications, Databases, QBIC, Video Analysis, Hardware and Software, Inspection continues in
Video Question Answering, Movies, Spatio-Temporal, Query, VQA.


Last update: Jan 29, 2023 at 20:54:24