20.4.3.3 Visual Question Answering, Query, VQA

Question Answer. Visual Q-A. VQA. Subsets:
See also VQA, Visual Question Answering, Neural Networks.
See also Vision-Language Models, Language-Vision Models, VQA.
See also Video Question Answering, Movies, Spatio-Temporal, Query, VQA.
See also Visual Dialog. And the related:
See also Visual Grounding, Grounding Expressions.
See also Visual Question Answering, Datasets, Benchmarks, Surveys. Other Datasets may be in:
See also Object Recognition, Retrieval Datasets.
See also Context in Computer Vision.

Agrawal, A.[Aishwarya], Lu, J.[Jiasen], Antol, S.[Stanislaw], Mitchell, M.[Margaret], Zitnick, C.L.[C. Lawrence], Parikh, D.[Devi], Batra, D.[Dhruv],
VQA: Visual Question Answering,
IJCV(123), No. 1, May 2017, pp. 4-31.
Springer DOI 1705
BibRef

Lioutas, V.[Vasileios], Passalis, N.[Nikolaos], Tefas, A.[Anastasios],
Explicit ensemble attention learning for improving visual question answering,
PRL(111), 2018, pp. 51-57.
Elsevier DOI 1808
Visual question answering, Explicit attention, Pictorial superiority effect BibRef

Garg, S.[Shivam], Srivastava, R.[Rajeev],
Object sequences: encoding categorical and spatial information for a yes/no visual question answering task,
IET-CV(12), No. 8, December 2018, pp. 1141-1150.
DOI Link 1812
BibRef

Goyal, Y.[Yash], Khot, T.[Tejas], Agrawal, A.[Aishwarya], Summers-Stay, D.[Douglas], Batra, D.[Dhruv], Parikh, D.[Devi],
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering,
IJCV(127), No. 4, April 2019, pp. 398-414.
Springer DOI 1903
BibRef
Earlier: A1, A2, A4, A5, A6, Only: CVPR17(6325-6334)
IEEE DOI 1711
Benchmark testing, Data collection, Data models, Knowledge discovery, Protocols, Visualization BibRef

Fang, Z.W.[Zhi-Wei], Liu, J.[Jing], Li, Y.[Yong], Qiao, Y.Y.[Yan-Yuan], Lu, H.Q.[Han-Qing],
Improving visual question answering using dropout and enhanced question encoder,
PR(90), 2019, pp. 404-414.
Elsevier DOI 1903
Visual question answering, Coherent dropout, Siamese dropout, Enhanced question encoder BibRef

Osman, A.[Ahmed], Samek, W.[Wojciech],
DRAU: Dual Recurrent Attention Units for Visual Question Answering,
CVIU(185), 2019, pp. 24-30.
Elsevier DOI 1906
Visual Question Answering, Attention Mechanisms, Multi-modal Learning, Machine Vision, Natural Language Processing BibRef

Toor, A.S.[Andeep S.], Wechsler, H.[Harry], Nappi, M.[Michele],
Biometric surveillance using visual question answering,
PRL(126), 2019, pp. 111-118.
Elsevier DOI 1909
Biometrics, Forensics, Visual question answering, Question relevance, Surveillance, Deep learning, Visual turing test BibRef

Li, W.W.[Wen-Wen], Song, M.M.[Miao-Miao], Tian, Y.Y.[Yuan-Yuan],
An Ontology-Driven Cyberinfrastructure for Intelligent Spatiotemporal Question Answering and Open Knowledge Discovery,
IJGI(8), No. 11, 2019, pp. xx-yy.
DOI Link 1912
BibRef

Xi, Y.L.[Yu-Ling], Zhang, Y.N.[Yan-Ning], Ding, S.T.[Song-Tao], Wan, S.H.[Shao-Hua],
Visual Question Answering Model Based on Visual Relationship Detection,
SP:IC(80), 2020, pp. 115648.
Elsevier DOI 1912
Visual question answering, Appearance features, Relationship predicate, Word vector similarity BibRef

Wu, Y., Jiang, L., Yang, Y.,
Revisiting EmbodiedQA: A Simple Baseline and Beyond,
IP(29), 2020, pp. 3984-3992.
IEEE DOI 2002
Embodied question answering, vision and language, visual question answering BibRef

Huang, C.R.[Chao-Ran], Yao, L.[Lina], Wang, X.Z.[Xian-Zhi], Benatallah, B.[Boualem], Zhang, X.[Xiang],
Software expert discovery via knowledge domain embeddings in a collaborative network,
PRL(130), 2020, pp. 46-53.
Elsevier DOI 2002
Knowledge discovery, Stack overflow, Expertise finding, Question answering, Expert as a Service BibRef

Li, W.[Wei], Sun, J.H.[Jian-Hui], Liu, G.[Ge], Zhao, L.[Linglan], Fang, X.Z.[Xiang-Zhong],
Visual question answering with attention transfer and a cross-modal gating mechanism,
PRL(133), 2020, pp. 334-340.
Elsevier DOI 2005
Attention, Visual question answering, Gating BibRef

Messina, N.[Nicola], Amato, G.[Giuseppe], Carrara, F.[Fabio], Falchi, F.[Fabrizio], Gennaro, C.[Claudio],
Learning visual features for relational CBIR,
MultInfoRetr(9), No. 2, June 2020, pp. 113-124.
Springer DOI 2005
BibRef
Earlier:
Learning Relationship-Aware Visual Features,
CEFR-LCV18(IV:486-501).
Springer DOI 1905
BibRef

Methani, N., Ganguly, P., Khapra, M.M., Kumar, P.,
PlotQA: Reasoning over Scientific Plots,
WACV20(1516-1525)
IEEE DOI 2006
Vocabulary, Cognition, Bars, Numerical models, Optical character recognition software, Data mining, Image color analysis BibRef

Yu, J.[Jing], Zhu, Z.H.[Zi-Hao], Wang, Y.J.[Yu-Jing], Zhang, W.F.[Wei-Feng], Hu, Y.[Yue], Tan, J.L.[Jian-Long],
Cross-modal knowledge reasoning for knowledge-based visual question answering,
PR(108), 2020, pp. 107563.
Elsevier DOI 2008
Cross-modal knowledge reasoning, Multimodal knowledge graphs, Compositional reasoning module, Explainable reasoning BibRef

Yang, Z.Q.[Zhuo-Qian], Qin, Z.C.[Zeng-Chang], Yu, J.[Jing], Wan, T.[Tao],
Prior Visual Relationship Reasoning For Visual Question Answering,
ICIP20(1411-1415)
IEEE DOI 2011
Visualization, Semantics, Convolution, Cognition, Knowledge discovery, Benchmark testing, Measurement, VQA, GCN, Attention Mechanism BibRef

Farazi, M.R.[Moshiur R.], Khan, S.H.[Salman H.], Barnes, N.M.[Nick M.],
From known to the unknown: Transferring knowledge to answer questions about novel visual and semantic concepts,
IVC(103), 2020, pp. 103985.
Elsevier DOI 2011
Visual Question Answering, Deep learning, Natural language processing, Dataset bias BibRef

Terao, K.[Kento], Tamaki, T.[Toru], Raytchev, B.[Bisser], Kaneda, K.[Kazufumi], Satoh, S.[Shin'ichi],
Rephrasing Visual Questions by Specifying the Entropy of the Answer Distribution,
IEICE(E103-D), No. 11, November 2020, pp. 2362-2370.
WWW Link. 2011
BibRef

Yu, J.[Jing], Zhang, W.F.[Wei-Feng], Lu, Y.H.[Yu-Hang], Qin, Z.C.[Zeng-Chang], Hu, Y.[Yue], Tan, J.L.[Jian-Long], Wu, Q.[Qi],
Reasoning on the Relation: Enhancing Visual Representation for Visual Question Answering and Cross-Modal Retrieval,
MultMed(22), No. 12, December 2020, pp. 3196-3209.
IEEE DOI 2011
Visualization, Cognition, Task analysis, Knowledge discovery, Semantics, Correlation, Information retrieval, cross-modal information retrieval BibRef

Lobry, S., Marcos, D., Murray, J., Tuia, D.,
RSVQA: Visual Question Answering for Remote Sensing Data,
GeoRS(58), No. 12, December 2020, pp. 8555-8566.
IEEE DOI 2012
Remote sensing, Task analysis, Visualization, Data models, Feature extraction, Knowledge discovery, visual question answering (VQA) BibRef

Faure, M.[Maxime], Lobry, S.[Sylvain], Kurtz, C.[Camille], Wendling, L.[Laurent],
Embedding Spatial Relations in Visual Question Answering for Remote Sensing,
ICPR22(310-316)
IEEE DOI 2212
Training, Visualization, Histograms, Feature extraction, Question answering (information retrieval), Spatial databases BibRef

Chappuis, C.[Christel], Zermatten, V.[Valérie], Lobry, S.[Sylvain], Le Saux, B.[Bertrand], Tuia, D.[Devis],
Prompt-RSVQA: Prompting visual context to a language model for Remote Sensing Visual Question Answering,
EarthVision22(1371-1380)
IEEE DOI 2210
Training, Visualization, Natural languages, Feature extraction, Transformers, Question answering (information retrieval), Data mining BibRef

Sun, B.[Bo], Yao, Z.[Zeng], Zhang, Y.H.[Ying-Hui], Yu, L.J.[Le-Jun],
Local relation network with multilevel attention for visual question answering,
JVCIR(73), 2020, pp. 102762.
Elsevier DOI 2012
Visual question answering, Relation network, Attention mechanism BibRef

Li, X., Yuan, A., Lu, X.,
Vision-to-Language Tasks Based on Attributes and Attention Mechanism,
Cyber(51), No. 2, February 2021, pp. 913-926.
IEEE DOI 2101
Semantics, Task analysis, Visualization, Cats, Natural languages, Knowledge discovery, Feature extraction, Deep learning, visual question answering (VQA) BibRef

Shao, Y.[Yinan], Lin, J.C.W.[Jerry Chun-Wei], Srivastava, G.[Gautam], Jolfaei, A.[Alireza], Guo, D.D.[Dong-Dong], Hu, Y.[Yi],
Self-attention-based conditional random fields latent variables model for sequence labeling,
PRL(145), 2021, pp. 157-164.
Elsevier DOI 2104
Latent CRF, Sequence labeling, Encoding schema, Natural language processing, VQA, Big data BibRef

Wu, Y.[Yirui], Ma, Y.T.[Yun-Tao], Wan, S.H.[Shao-Hua],
Multi-scale relation reasoning for multi-modal Visual Question Answering,
SP:IC(96), 2021, pp. 116319.
Elsevier DOI 2106
Multi-modal data, Visual Question Answering, Multi-scale relation reasoning, Attention model BibRef

Ma, Y.T.[Yun-Tao], Lu, T.[Tong], Wu, Y.[Yirui],
Multi-scale Relational Reasoning with Regional Attention for Visual Question Answering,
ICPR21(5642-5649)
IEEE DOI 2105
Visualization, Neural networks, Knowledge discovery, Cognition, Robustness, Data mining, Visual question learning, Attention, Multi-scale relational reasoning BibRef

dos S-Silva, F.H.[Francisco H.], Bezerra, G.M.[Gabriel M.], Holanda, G.B.[Gabriel B.], de Souza, J.W.M.[J. Wellington M.], Rego, P.A.L.[Paulo A.L.], Lira Neto, A.V.[Aloísio V.], de Albuquerque, V.H.C.[Victor Hugo C.], Rebouças Filho, P.P.[Pedro P.],
A novel feature extractor for human action recognition in visual question answering,
PRL(147), 2021, pp. 41-47.
Elsevier DOI 2106
BibRef

Guo, W.[Wenya], Zhang, Y.[Ying], Yang, J.F.[Ju-Feng], Yuan, X.J.[Xiao-Jie],
Re-Attention for Visual Question Answering,
IP(30), 2021, pp. 6730-6743.
IEEE DOI 2108
Visualization, Tires, Task analysis, Feature extraction, Training, Knowledge discovery, Image reconstruction, gating mechanism BibRef

Hu, J.[Jun], Qian, S.S.[Sheng-Sheng], Fang, Q.[Quan], Xu, C.S.[Chang-Sheng],
Heterogeneous Community Question Answering via Social-Aware Multi-Modal Co-Attention Convolutional Matching,
MultMed(23), 2021, pp. 2321-2334.
IEEE DOI 2108
Visualization, Semantics, Knowledge discovery, Context modeling, Portable computers, Task analysis, Object detection, social multimedia BibRef

Zhang, X.[Xi], Zhang, F.F.[Fei-Fei], Xu, C.S.[Chang-Sheng],
NExT-OOD: Overcoming Dual Multiple-Choice VQA Biases,
PAMI(46), No. 4, April 2024, pp. 1913-1931.
IEEE DOI 2403
Visualization, Feature extraction, Benchmark testing, Correlation, Predictive models, Cognition, Training, Benchmark, bias, multiple-choice VQA BibRef

Farazi, M.[Moshiur], Khan, S.[Salman], Barnes, N.M.[Nick M.],
Accuracy vs. complexity: A trade-off in visual question answering models,
PR(120), 2021, pp. 108106.
Elsevier DOI 2109
Visual question answering, Visual feature extraction, Language features, Multi-modal fusion, Speed-accuracy trade-off BibRef

Barra, S.[Silvio], Bisogni, C.[Carmen], de Marsico, M.[Maria], Ricciardi, S.[Stefano],
Visual question answering: Which investigated applications?,
PRL(151), 2021, pp. 325-331.
Elsevier DOI 2110
Visual question answering, Real-world VQA, VQA for medical applications, VQA for assistive applications, VQA in cultural heritage and education BibRef

Manmadhan, S.[Sruthy], Kovoor, B.C.[Binsu C.],
Multi-Tier Attention Network using Term-weighted Question Features for Visual Question Answering,
IVC(115), 2021, pp. 104291.
Elsevier DOI 2110
Attention mechanism, Deep learning, Semantic similarity, Supervised term weighting, Visual Question Answering BibRef

Liu, A.A.[An-An], Lu, Z.[Zimu], Xu, N.[Ning], Nie, W.Z.[Wei-Zhi], Li, W.H.[Wen-Hui],
Multi-type decision fusion network for visual Q&A,
IVC(115), 2021, pp. 104281.
Elsevier DOI 2110
Visual question answering, Multi-type question, Scene graph BibRef

Patro, B.N.[Badri N.], Kurmi, V.K.[Vinod K.], Kumar, S.[Sandeep], Namboodiri, V.P.[Vinay P.],
MUMC: Minimizing uncertainty of mixture of cues,
IVC(115), 2021, pp. 104280.
Elsevier DOI 2110
Uncertainty estimation, Mixture of cues, Visual Question Answering, Paraphrase, Encoder-decoder BibRef

Liu, F.[Fei], Liu, J.[Jing], Fang, Z.W.[Zhi-Wei], Hong, R.C.[Ri-Chang], Lu, H.Q.[Han-Qing],
Visual Question Answering With Dense Inter- and Intra-Modality Interactions,
MultMed(23), 2021, pp. 3518-3529.
IEEE DOI 2110
Visualization, Knowledge discovery, Connectors, Encoding, Task analysis, Image coding, Stacking, Visual question answering, dense interactions BibRef

Wu, J.J.[Jia-Jia], Du, J.[Jun], Wang, F.[Fengren], Yang, C.[Chen], Jiang, X.Z.[Xin-Zhe], Hu, J.[Jinshui], Yin, B.[Bing], Zhang, J.S.[Jian-Shu], Dai, L.R.[Li-Rong],
A multimodal attention fusion network with a dynamic vocabulary for TextVQA,
PR(122), 2022, pp. 108214.
Elsevier DOI 2112
Dynamic vocabulary, Attention map, Multimodal fusion, ST-VQA BibRef

Narayanan, A.[Abhishek], Rao, A.[Abijna], Prasad, A.[Abhishek], Natarajan, S.,
VQA as a factoid question answering problem: A novel approach for knowledge-aware and explainable visual question answering,
IVC(116), 2021, pp. 104328.
Elsevier DOI 2112
Visual question answering, Factoid question answering, Knowledge based reasoning, Explainable VQA BibRef

Guo, Y.Y.[Yang-Yang], Nie, L.Q.[Li-Qiang], Cheng, Z.Y.[Zhi-Yong], Tian, Q.[Qi], Zhang, M.[Min],
Loss Re-Scaling VQA: Revisiting the Language Prior Problem From a Class-Imbalance View,
IP(31), 2022, pp. 227-238.
IEEE DOI 2112
Visualization, Training, Computational modeling, Benchmark testing, Predictive models, Cognition, Task analysis, loss re-scaling BibRef

Peng, L.[Liang], Yang, Y.[Yang], Wang, Z.[Zheng], Huang, Z.[Zi], Shen, H.T.[Heng Tao],
MRA-Net: Improving VQA Via Multi-Modal Relation Attention Network,
PAMI(44), No. 1, January 2022, pp. 318-329.
IEEE DOI 2112
Visualization, Feature extraction, Semantics, Knowledge discovery, Cognition, Task analysis, Natural languages, relation attention BibRef

Manogaran, G.[Gunasekaran], Shakeel, P.M.[P. Mohamed], Burhanuddin, M.A., Baskar, S., Saravanan, V.[Vijayalakshmi], Crespo, R.G.[Rubén González], Martínez, O.S.[Oscar Sanjuán],
ADCCF: Adaptive deep concatenation coder framework for visual question answering,
PRL(152), 2021, pp. 348-355.
Elsevier DOI 2112
BibRef

Zhou, Y.[Yiyi], Ji, R.R.[Rong-Rong], Sun, X.S.[Xiao-Shuai], Su, J.S.[Jin-Song], Meng, D.Y.[De-Yu], Gao, Y.[Yue], Shen, C.H.[Chun-Hua],
Plenty is Plague: Fine-Grained Learning for Visual Question Answering,
PAMI(44), No. 2, February 2022, pp. 697-709.
IEEE DOI 2201
Training, Visualization, Knowledge discovery, Redundancy, Data models, Feature extraction, Training data, visual question answering BibRef

E, W.N.[Wei-Nan], Zhou, Y.J.[Ya-Jun],
A Mathematical Model for Universal Semantics,
PAMI(44), No. 3, March 2022, pp. 1124-1132.
IEEE DOI 2202
Semantics, Numerical models, Pattern analysis, Markov processes, Statistical analysis, Exponential distribution, question answering BibRef

Li, X.P.[Xiang-Peng], Wu, B.[Bo], Song, J.K.[Jing-Kuan], Gao, L.L.[Lian-Li], Zeng, P.P.[Peng-Peng], Gan, C.[Chuang],
Text-instance graph: Exploring the relational semantics for text-based visual question answering,
PR(124), 2022, pp. 108455.
Elsevier DOI 2203
Text-based visual question answering, Spatial overlapping, Text-Instance graph, Copy mechanism BibRef

Shao, X.J.[Xiang-Jun], Xiang, Z.L.[Zheng-Long], Li, Y.X.[Yuan-Xiang],
Visual question answering with gated relation-aware auxiliary,
IET-IPR(16), No. 5, 2022, pp. 1424-1432.
DOI Link 2203
BibRef

Liu, Y.[Yun], Zhang, X.M.[Xiao-Ming], Zhao, Z.Y.[Zhi-Yun], Zhang, B.[Bo], Cheng, L.[Lei], Li, Z.J.[Zhou-Jun],
ALSA: Adversarial Learning of Supervised Attentions for Visual Question Answering,
Cyber(52), No. 6, June 2022, pp. 4520-4533.
IEEE DOI 2207
Visualization, Correlation, Generators, Feature extraction, Task analysis, Knowledge discovery, Fuses, Adversarial learning, visual question answering (VQA) BibRef

Ouyang, N.L.[Ning-Lin], Huang, Q.B.[Qing-Bao], Li, P.J.[Pi-Jian], Cai, Y.[Yi], Liu, B.[Bin], Leung, H.F.[Ho-Fung], Li, Q.[Qing],
Suppressing Biased Samples for Robust VQA,
MultMed(24), 2022, pp. 3405-3415.
IEEE DOI 2207
Training, Visualization, Training data, Image color analysis, Sports, Knowledge discovery, Annotations, Visual Question Answering, Robust VQA BibRef

Shuang, K.[Kai], Guo, J.[Jinyu], Wang, Z.H.[Zi-Han],
Comprehensive-perception dynamic reasoning for visual question answering,
PR(131), 2022, pp. 108878.
Elsevier DOI 2208
Cross-modal information fusion, Visual question answering, Comprehensive perception, Relational reasoning BibRef

Gouthaman, K.V., Mittal, A.[Anurag],
On the role of question encoder sequence model in robust visual question answering,
PR(131), 2022, pp. 108883.
Elsevier DOI 2208
Visual question answering, Out-of-distribution performance, Gated recurrent unit, Transformer, Graph attention network BibRef

Chen, C.Q.[Chong-Qing], Han, D.Z.[De-Zhi], Chang, C.C.[Chin-Chen],
CAAN: Context-Aware attention network for visual question answering,
PR(132), 2022, pp. 108980.
Elsevier DOI 2209
Visual question answering, Attention mechanism, Understanding bias, Absolute position, Contextual information BibRef

Xie, J.Y.[Jia-Yuan], Fang, W.H.[Wen-Hao], Cai, Y.[Yi], Huang, Q.B.[Qing-Bao], Li, Q.[Qing],
Knowledge-Based Visual Question Generation,
CirSysVideo(32), No. 11, November 2022, pp. 7547-7558.
IEEE DOI 2211
Visualization, Feature extraction, Task analysis, Knowledge based systems, Knowledge representation, Decoding, multimodal BibRef

Gao, C.Y.[Chen-Yu], Zhu, Q.[Qi], Wang, P.[Peng], Li, H.[Hui], Liu, Y.L.[Yu-Liang], van den Hengel, A.J.[Anton J.], Wu, Q.[Qi],
Structured Multimodal Attentions for TextVQA,
PAMI(44), No. 12, December 2022, pp. 9603-9614.
IEEE DOI 2212
Optical character recognition software, Cognition, Visualization, Text recognition, Task analysis, Knowledge discovery, Annotations, transformer BibRef

Jin, Z.X.[Zan-Xia], Wu, H.[Heran], Yang, C.[Chun], Zhou, F.[Fang], Qin, J.Y.[Jing-Yan], Xiao, L.[Lei], Yin, X.C.[Xu-Cheng],
RUArt: A Novel Text-Centered Solution for Text-Based Visual Question Answering,
MultMed(25), 2023, pp. 1-12.
IEEE DOI 2301
Optical character recognition software, Semantics, Visualization, Cognition, Knowledge discovery, Task analysis, Attention mechanism, visual question answering BibRef

Beckham, C.[Christopher], Weiss, M.[Martin], Golemo, F.[Florian], Honari, S.[Sina], Nowrouzezahrai, D.[Derek], Pal, C.[Christopher],
Visual question answering from another perspective: CLEVR mental rotation tests,
PR(136), 2023, pp. 109209.
Elsevier DOI 2301
Deep learning, Computer vision, Visual question answering, Contrastive learning, Clevr BibRef

Zhang, H.N.[Hao-Nan], Zeng, P.P.[Peng-Peng], Hu, Y.X.[Yu-Xuan], Qian, J.[Jin], Song, J.K.[Jing-Kuan], Gao, L.[Lianli],
Learning visual question answering on controlled semantic noisy labels,
PR(138), 2023, pp. 109339.
Elsevier DOI 2303
Visual question answering, Noisy datasets, Semantic labels, Contrastive learning BibRef

Zeng, G.[Gangyan], Zhang, Y.[Yuan], Zhou, Y.[Yu], Yang, X.M.[Xiao-Meng], Jiang, N.[Ning], Zhao, G.Q.[Guo-Qing], Wang, W.P.[Wei-Ping], Yin, X.C.[Xu-Cheng],
Beyond OCR + VQA: Towards end-to-end reading and reasoning for robust and accurate TextVQA,
PR(138), 2023, pp. 109337.
Elsevier DOI 2303
Textvqa, End-to-end, Scene text reading, Scene text reasoning BibRef

Gao, D.F.[Di-Fei], Wang, R.P.[Rui-Ping], Shan, S.G.[Shi-Guang], Chen, X.L.[Xi-Lin],
CRIC: A VQA Dataset for Compositional Reasoning on Vision and Commonsense,
PAMI(45), No. 5, May 2023, pp. 5561-5578.
IEEE DOI 2304
Visualization, Task analysis, Tail, Head, Annotations, Magnetic heads, Mouth, Visual question answering, compositional reasoning, dataset construction BibRef

Xu, F.Z.[Fang-Zhi], Lin, Q.[Qika], Liu, J.[Jun], Zhang, L.L.[Ling-Ling], Zhao, T.Z.[Tian-Zhe], Chai, Q.[Qi], Pan, Y.[Yudai], Huang, Y.[Yi], Wang, Q.[Qianying],
MoCA: Incorporating domain pretraining and cross attention for textbook question answering,
PR(140), 2023, pp. 109588.
Elsevier DOI 2305
Textbook question answering, Multimodal, Pretraining, Attention BibRef

Li, P.[Pengju], Tan, Z.[Zhiyi], Bao, B.K.[Bing-Kun],
Multiview Language Bias Reduction for Visual Question Answering,
MultMedMag(30), No. 1, January 2023, pp. 91-99.
IEEE DOI 2305
Visualization, Training, Image color analysis, Predictive models, Task analysis, inter-question type bias BibRef

Li, H.M.[Hui-Min], Han, D.Z.[De-Zhi], Chen, C.Q.[Chong-Qing], Chang, C.C.[Chin-Chen], Li, K.C.[Kuan-Ching], Li, D.[Dun],
A Visual Question Answering Network Merging High- and Low-Level Semantic Information,
IEICE(E106-D), No. 5, May 2023, pp. 581-589.
WWW Link. 2305
BibRef

Liu, B.[Bo], Zhan, L.M.[Li-Ming], Xu, L.[Li], Wu, X.M.[Xiao-Ming],
Medical Visual Question Answering via Conditional Reasoning and Contrastive Learning,
MedImg(42), No. 5, May 2023, pp. 1532-1545.
IEEE DOI 2305
Task analysis, Feature extraction, Visualization, Cognition, Question answering (information retrieval), Training, Radiology, contrastive learning BibRef

Wu, J.M.[Jin-Meng], Ge, F.[Fulin], Hong, H.Y.[Han-Yu], Shi, Y.[Yu], Hao, Y.B.[Yan-Bin], Ma, L.[Lei],
Question-aware dynamic scene graph of local semantic representation learning for visual question answering,
PRL(170), 2023, pp. 93-99.
Elsevier DOI 2306
Interactive semantic representation, Dynamic scene graph, Local feature detection, Attention mechanism BibRef

Li, H.[Hao], Huang, J.[Jinfa], Jin, P.[Peng], Song, G.[Guoli], Wu, Q.[Qi], Chen, J.[Jie],
Weakly-Supervised 3D Spatial Reasoning for Text-Based Visual Question Answering,
IP(32), 2023, pp. 3367-3382.
IEEE DOI 2307
Three-dimensional displays, Cognition, Solid modeling, Visualization, Optical character recognition, Task analysis, transformer BibRef

Li, Z.Y.[Zhen-Yang], Guo, Y.Y.[Yang-Yang], Wang, K.[Kejie], Wei, Y.W.[Yin-Wei], Nie, L.Q.[Li-Qiang], Kankanhalli, M.[Mohan],
Joint Answering and Explanation for Visual Commonsense Reasoning,
IP(32), 2023, pp. 3836-3846.
IEEE DOI 2307
Video recording, Visualization, Question answering (information retrieval), Task analysis, knowledge distillation BibRef

Yang, X.F.[Xiao-Feng], Lv, F.[Fengmao], Liu, F.[Fayao], Lin, G.S.[Guo-Sheng],
Self-Training Vision Language BERTs With a Unified Conditional Model,
CirSysVideo(33), No. 8, August 2023, pp. 3560-3569.
IEEE DOI 2308
Data models, Task analysis, Bidirectional control, Training, Predictive models, Bit error rate, Visualization, semi-supervised learning BibRef

Chen, L.[Long], Zheng, Y.H.[Yu-Hang], Niu, Y.[Yulei], Zhang, H.W.[Han-Wang], Xiao, J.[Jun],
Counterfactual Samples Synthesizing and Training for Robust Visual Question Answering,
PAMI(45), No. 11, November 2023, pp. 13218-13234.
IEEE DOI 2310
BibRef

Chen, L.[Long], Yan, X., Xiao, J.[Jun], Zhang, H.W.[Han-Wang], Pu, S., Zhuang, Y.,
Counterfactual Samples Synthesizing for Robust Visual Question Answering,
CVPR20(10797-10806)
IEEE DOI 2008
Training, Cascading style sheets, Predictive models, Visualization, Image color analysis, Linguistics, Computational modeling BibRef

Wang, B.Y.[Bo-Yue], Ma, Y.J.[Yu-Jian], Li, X.Y.[Xiao-Yan], Liu, H.[Heng], Hu, Y.L.[Yong-Li], Yin, B.C.[Bao-Cai],
DSGEM: Dual scene graph enhancement module-based visual question answering,
IET-CV(17), No. 6, 2023, pp. 638-651.
DOI Link 2310
image representation, question answering (information retrieval) BibRef

Bi, Y.D.[Yan-Dong], Jiang, H.[Huajie], Zhang, H.[Hanfu], Hu, Y.L.[Yong-Li], Yin, B.C.[Bao-Cai],
Self-supervised knowledge distillation in counterfactual learning for VQA,
PRL(177), 2024, pp. 33-39.
Elsevier DOI 2401
Visual question answering, Counterfactual learning, Self-supervised learning, Language bias BibRef

Tan, S.[Sinan], Ge, M.M.[Meng-Meng], Guo, D.[Di], Liu, H.P.[Hua-Ping], Sun, F.C.[Fu-Chun],
Knowledge-Based Embodied Question Answering,
PAMI(45), No. 10, October 2023, pp. 11948-11960.
IEEE DOI 2310
BibRef

Tan, S.[Sinan], Xiang, W.L.[Wei-Lai], Liu, H.P.[Hua-Ping], Guo, D.[Di], Sun, F.C.[Fu-Chun],
Multi-agent Embodied Question Answering in Interactive Environments,
ECCV20(XIII:663-678).
Springer DOI 2011
BibRef

Mohamud, S.A.M.[Safaa Abdullahi Moallim], Jalali, A.[Amin], Lee, M.H.[Min-Ho],
Encoder-decoder cycle for visual question answering based on perception-action cycle,
PR(144), 2023, pp. 109848.
Elsevier DOI 2310
Visual question answering, Vision language tasks, Multi-modality fusion, Attention, Bilinear fusion, Brain-inspired frameworks BibRef

Tito, R.[Rubèn], Karatzas, D.[Dimosthenis], Valveny, E.[Ernest],
Hierarchical multimodal transformers for Multipage DocVQA,
PR(144), 2023, pp. 109834.
Elsevier DOI 2310
Multipage document Visual Question Answering, Document Visual Question Answering, Multipage documents, Document Intelligence BibRef

Wang, Y.X.[Ya-Xian], Wei, B.[Bifan], Liu, J.[Jun], Zhang, L.L.[Ling-Ling], Wang, J.X.[Jia-Xin], Wang, Q.Y.[Qian-Ying],
DisAVR: Disentangled Adaptive Visual Reasoning Network for Diagram Question Answering,
IP(32), 2023, pp. 4812-4827.
IEEE DOI 2310
BibRef

Han, Y.D.[Yu-Dong], Yin, J.H.[Jian-Hua], Wu, J.L.[Jian-Long], Wei, Y.W.[Yin-Wei], Nie, L.Q.[Li-Qiang],
Semantic-Aware Modular Capsule Routing for Visual Question Answering,
IP(32), 2023, pp. 5537-5549.
IEEE DOI 2310
BibRef

Qian, T.W.[Tian-Wen], Chen, J.J.[Jing-Jing], Chen, S.X.[Shao-Xiang], Wu, B.[Bo], Jiang, Y.G.[Yu-Gang],
Scene Graph Refinement Network for Visual Question Answering,
MultMed(25), 2023, pp. 3950-3961.
IEEE DOI 2310
BibRef

Qin, B.S.[Bo-Sheng], Hu, H.J.[Hao-Ji], Zhuang, Y.T.[Yue-Ting],
Deep Residual Weight-Sharing Attention Network With Low-Rank Attention for Visual Question Answering,
MultMed(25), 2023, pp. 4282-4295.
IEEE DOI Code:
WWW Link. 2310
BibRef

Zhou, S.[Sheng], Guo, D.[Dan], Li, J.[Jia], Yang, X.[Xun], Wang, M.[Meng],
Exploring Sparse Spatial Relation in Graph Inference for Text-Based VQA,
IP(32), 2023, pp. 5060-5074.
IEEE DOI 2310
BibRef

Biswas, K.[Kunal], Shivakumara, P.[Palaiahnakote], Pal, U.[Umapada], Liu, C.L.[Cheng-Lin], Lu, Y.[Yue],
VQAPT: A New visual question answering model for personality traits in social media images,
PRL(175), 2023, pp. 66-73.
Elsevier DOI 2311
Personality trait images, Multimodal concept, Text recognition, Social media images, Natural language processing, Visual question answering BibRef

Cho, J.W.[Jae Won], Argaw, D.M.[Dawit Mureja], Oh, Y.[Youngtaek], Kim, D.J.[Dong-Jin], Kweon, I.S.[In So],
Empirical study on using adapters for debiased Visual Question Answering,
CVIU(237), 2023, pp. 103842.
Elsevier DOI 2311
Visual Question Answering, Model Robustness, Biased Data, Adapters BibRef

Cho, J.W.[Jae Won], Kim, D.J.[Dong-Jin], Choi, J.[Jinsoo], Jung, Y.[Yunjae], Kweon, I.S.[In So],
Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task through Knowledge Distillation,
MULA21(1592-1601)
IEEE DOI 2109
Visualization, Knowledge discovery, Pattern recognition, Task analysis, Bars BibRef

Cho, J.W.[Jae Won], Kim, D.J.[Dong-Jin], Ryu, H.[Hyeonggon], Kweon, I.S.[In So],
Generative Bias for Robust Visual Question Answering,
CVPR23(11681-11690)
IEEE DOI 2309
BibRef

Liu, Y.H.[Yu-Hang], Wei, W.[Wei], Peng, D.[Daowan], Mao, X.L.[Xian-Ling], He, Z.Y.[Zhi-Yong], Zhou, P.[Pan],
Depth-Aware and Semantic Guided Relational Attention Network for Visual Question Answering,
MultMed(25), 2023, pp. 5344-5357.
IEEE DOI 2311
BibRef

Mao, A.[Aihua], Yang, Z.[Zhi], Lin, K.[Ken], Xuan, J.[Jun], Liu, Y.J.[Yong-Jin],
Positional Attention Guided Transformer-Like Architecture for Visual Question Answering,
MultMed(25), 2023, pp. 6997-7009.
IEEE DOI 2311
BibRef

Sun, H.[Hao], Wang, S.[Shu], Zhu, Y.Q.[Yun-Qiang], Yuan, W.[Wen], Zou, Z.Q.[Zhi-Qiang],
Question Classification for Intelligent Question Answering: A Comprehensive Survey,
IJGI(12), No. 10, 2023, pp. 415.
DOI Link 2311
BibRef

Cao, B.W.[Bi-Wei], Cao, J.X.[Jiu-Xin], Gui, J.[Jie], Shen, J.[Jiayun], Liu, B.[Bo], He, L.[Lei], Tang, Y.Y.[Yuan Yan], Kwok, J.T.Y.[James Tin-Yau],
AlignVE: Visual Entailment Recognition Based on Alignment Relations,
MultMed(25), 2023, pp. 7378-7387.
IEEE DOI 2311
Recognize whether the semantics of a hypothesis text can be inferred from the given premise image. BibRef

Mashrur, A.[Akib], Luo, W.[Wei], Zaidi, N.A.[Nayyar A.], Robles-Kelly, A.[Antonio],
Robust visual question answering via semantic cross modal augmentation,
CVIU(238), 2024, pp. 103862.
Elsevier DOI 2312
Visual question answering, Transformers, Multimodal learning, Model Robustness, Data augmentation BibRef

Yu, Z.[Zhou], Jin, Z.[Zitian], Yu, J.[Jun], Xu, M.L.[Ming-Liang], Wang, H.B.[Hong-Bo], Fan, J.P.[Jian-Ping],
Bilaterally Slimmable Transformer for Elastic and Efficient Visual Question Answering,
MultMed(25), 2023, pp. 9543-9556.
IEEE DOI 2312
BibRef

Yao, H.B.[Hai-Bo], Wang, L.P.[Li-Peng], Cai, C.T.[Cheng-Tao], Sun, Y.X.[Yu-Xin], Zhang, Z.[Zhi], Luo, Y.K.[Yong-Kang],
Multi-modal spatial relational attention networks for visual question answering,
IVC(140), 2023, pp. 104840.
Elsevier DOI 2312
Visual question answering, Spatial relation, Attention mechanism, Pre-training strategy BibRef

Huang, X.F.[Xiao-Fei], Gong, H.F.[Hong-Fang],
A Dual-Attention Learning Network With Word and Sentence Embedding for Medical Visual Question Answering,
MedImg(43), No. 2, February 2024, pp. 832-845.
IEEE DOI 2402
Feature extraction, Visualization, Medical diagnostic imaging, Data mining, Question answering (information retrieval), visual reasoning BibRef

Zheng, W.B.[Wen-Bo], Yan, L.[Lan], Wang, F.Y.[Fei-Yue],
So Many Heads, So Many Wits: Multimodal Graph Reasoning for Text-Based Visual Question Answering,
SMCS(54), No. 2, February 2024, pp. 854-865.
IEEE DOI 2402
Visualization, Cognition, Question answering (information retrieval), Feature extraction, text-based visual question answering BibRef

Bi, Y.D.[Yan-Dong], Jiang, H.[Huajie], Hu, Y.L.[Yong-Li], Sun, Y.F.[Yan-Feng], Yin, B.C.[Bao-Cai],
See and Learn More: Dense Caption-Aware Representation for Visual Question Answering,
CirSysVideo(34), No. 2, February 2024, pp. 1135-1146.
IEEE DOI 2402
Visualization, Cognition, Question answering (information retrieval), Feature extraction, cross-modal fusion BibRef

Song, Y.[Yaguang], Yang, X.S.[Xiao-Shan], Wang, Y.[Yaowei], Xu, C.S.[Chang-Sheng],
Recovering Generalization via Pre-Training-Like Knowledge Distillation for Out-of-Distribution Visual Question Answering,
MultMed(26), 2024, pp. 837-851.
IEEE DOI 2402
Data models, Training, Task analysis, Training data, Robustness, Visualization, Question answering (information retrieval), Knowledge Distillation BibRef

Wu, S.[Sen], Zhao, G.[Guoshuai], Qian, X.M.[Xue-Ming],
Resolving Zero-Shot and Fact-Based Visual Question Answering via Enhanced Fact Retrieval,
MultMed(26), 2024, pp. 1790-1800.
IEEE DOI 2402
Visualization, Task analysis, Knowledge based systems, Question answering (information retrieval), Predictive models, knowledge graph BibRef

Wen, Z.Q.[Zhi-Quan], Niu, S.C.[Shuai-Cheng], Li, G.[Ge], Wu, Q.Y.[Qing-Yao], Tan, M.K.[Ming-Kui], Wu, Q.[Qi],
Test-Time Model Adaptation for Visual Question Answering With Debiased Self-Supervisions,
MultMed(26), 2024, pp. 2137-2147.
IEEE DOI 2402
Adaptation models, Training, Visualization, Entropy, Task analysis, Question answering (information retrieval), Data models, test-time debiased self-supervised BibRef

Huai, T.Y.[Tian-Yu], Yang, S.W.[Shu-Wen], Zhang, J.H.[Jun-Hang], Zhao, J.B.[Jia-Bao], He, L.[Liang],
Debiased Visual Question Answering via the perspective of question types,
PRL(178), 2024, pp. 181-187.
Elsevier DOI 2402
Visual Question Answering, De-biasing, Self-supervised BibRef

Jiang, J.J.[Jing-Jing], Liu, Z.Y.[Zi-Yi], Zheng, N.N.[Nan-Ning],
Correlation Information Bottleneck: Towards Adapting Pretrained Multimodal Models for Robust Visual Question Answering,
IJCV(132), No. 1, January 2024, pp. 185-207.
Springer DOI 2402
BibRef

Xu, N.[Ning], Lu, Z.[Zimu], Tian, H.[Hongshuo], Kang, R.[Rongbao], Cao, J.[Jinbo], Zhang, Y.D.[Yong-Dong], Liu, A.A.[An-An],
Learning to Supervise Knowledge Retrieval Over a Tree Structure for Visual Question Answering,
MultMed(26), 2024, pp. 6689-6700.
IEEE DOI 2404
Knowledge based systems, Task analysis, Uncertainty, Visualization, Knowledge engineering, History, supervised knowledge retrieval BibRef

Pan, Y.H.[Yong-Hua], Liu, J.[Jing], Jin, L.[Lu], Li, Z.C.[Ze-Chao],
Unbiased Visual Question Answering by Leveraging Instrumental Variable,
MultMed(26), 2024, pp. 6648-6662.
IEEE DOI 2404
Visualization, Correlation, Instruments, Training, Predictive models, Color, Generators, Visual question answering, out of distribution BibRef

Zhang, S.[Siyu], Chen, Y.[Yeming], Sun, Y.[Yaoru], Wang, F.[Fang], Shi, H.B.[Hai-Bo], Wang, H.R.[Hao-Ran],
LOIS: Looking Out of Instance Semantics for Visual Question Answering,
MultMed(26), 2024, pp. 6202-6214.
IEEE DOI 2404
Visualization, Semantics, Task analysis, Feature extraction, Question answering (information retrieval), Cognition, Detectors, multimodal relation attention BibRef

Xie, J.Y.[Jia-Yuan], Cai, Y.[Yi], Chen, J.L.[Jia-Li], Xu, R.[Ruohang], Wang, J.[Jiexin], Li, Q.[Qing],
Knowledge-Augmented Visual Question Answering With Natural Language Explanation,
IP(33), 2024, pp. 2652-2664.
IEEE DOI Code:
WWW Link. 2404
Task analysis, Visualization, Feature extraction, Question answering (information retrieval), Iterative methods, multimodal BibRef

Hu, Z.J.[Zhong-Jian], Yang, P.[Peng], Jiang, Y.S.[Yuan-Shuang], Bai, Z.J.[Zi-Jian],
Prompting large language model with context and pre-answer for knowledge-based VQA,
PR(151), 2024, pp. 110399.
Elsevier DOI 2404
Visual question answering, Large language model, Knowledge-based VQA, Fine-tuning, In-context learning BibRef

Wang, Q.[Qunbo], Liu, J.[Jing], Wu, W.J.[Wen-Jun],
Coordinating explicit and implicit knowledge for knowledge-based VQA,
PR(151), 2024, pp. 110368.
Elsevier DOI 2404
Pre-trained model, Knowledge-based VQA, Knowledge retrieval BibRef

Wei, M.[Meng], Chen, L.[Long], Ji, W.[Wei], Yue, X.Y.[Xiao-Yu], Zimmermann, R.[Roger],
In Defense of Clip-Based Video Relation Detection,
IP(33), 2024, pp. 2759-2769.
IEEE DOI 2404
Context modeling, Visualization, Proposals, Image coding, Trajectory, Training, hierarchical context modeling BibRef

Ma, J.[Jie], Liu, J.[Jun], Chai, Q.[Qi], Wang, P.H.[Ping-Hui], Tao, J.[Jing],
Diagram Perception Networks for Textbook Question Answering via Joint Optimization,
IJCV(132), No. 5, May 2024, pp. 1578-1591.
Springer DOI 2405
BibRef

Wang, J.[Junjue], Ma, A.[Ailong], Chen, Z.H.[Zi-Hang], Zheng, Z.[Zhuo], Wan, Y.T.[Yu-Ting], Zhang, L.P.[Liang-Pei], Zhong, Y.F.[Yan-Fei],
EarthVQANet: Multi-task visual question answering for remote sensing image understanding,
PandRS(212), 2024, pp. 422-439.
Elsevier DOI Code:
HTML Version. 2406
Visual question answering, Semantic segmentation, Multi-modal fusion, Multi-task learning, Knowledge reasoning BibRef

Uehara, K.[Kohei], Harada, T.[Tatsuya],
Learning by Asking Questions for Knowledge-Based Novel Object Recognition,
IJCV(132), No. 6, June 2024, pp. 2290-2309.
Springer DOI 2406
BibRef
Earlier:
K-VQG: Knowledge-aware Visual Question Generation for Common-sense Acquisition,
WACV23(4390-4398)
IEEE DOI 2302
Recognize novel objects. Learning systems, Visualization, Knowledge acquisition, Benchmark testing, Task analysis, visual reasoning BibRef

Uehara, K.[Kohei], Duan, N.[Nan], Harada, T.[Tatsuya],
Learning to Ask Informative Sub-Questions for Visual Question Answering,
MULA22(4680-4689)
IEEE DOI 2210
Training, Visualization, Computational modeling, Reinforcement learning, Predictive models BibRef

Li, Y.K.[Yi-Kang], Duan, N.[Nan], Zhou, B.L.[Bo-Lei], Chu, X.[Xiao], Ouyang, W.L.[Wan-Li], Wang, X.G.[Xiao-Gang], Zhou, M.[Ming],
Visual Question Generation as Dual Task of Visual Question Answering,
CVPR18(6116-6124)
IEEE DOI 1812
Task analysis, Visualization, Knowledge discovery, Training, Computational modeling BibRef

Gao, P.[Peng], Li, H.S.[Hong-Sheng], Li, S.[Shuang], Lu, P.[Pan], Li, Y.K.[Yi-Kang], Hoi, S.C.H.[Steven C. H.], Wang, X.G.[Xiao-Gang],
Question-Guided Hybrid Convolution for Visual Question Answering,
ECCV18(I: 485-501).
Springer DOI 1810
BibRef

Gao, P.[Peng], Jiang, Z.K.[Zheng-Kai], You, H.X.[Hao-Xuan], Lu, P.[Pan], Hoi, S.C.H.[Steven C. H.], Wang, X.G.[Xiao-Gang], Li, H.S.[Hong-Sheng],
Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering,
CVPR19(6632-6641).
IEEE DOI 2002
BibRef

Wang, J.[Jialou], Zhu, M.[Manli], Li, Y.[Yulei], Li, H.L.[Hong-Lei], Yang, L.Z.[Long-Zhi], Woo, W.L.[Wai Lok],
Detect2Interact: Localizing Object Key Field in Visual Question Answering with LLMs,
IEEE_Int_Sys(39), No. 3, May 2024, pp. 35-44.
IEEE DOI 2407
Visualization, Semantics, Object detection, Image segmentation, Task analysis, Computational modeling, Chatbots, Spatial resolution BibRef

Qian, S.[Shun], Liu, B.Q.[Bing-Quan], Sun, C.J.[Cheng-Jie], Xu, Z.[Zhen], Ma, L.[Lin], Wang, B.[Baoxun],
CroMIC-QA: The Cross-Modal Information Complementation Based Question Answering,
MultMed(26), 2024, pp. 8348-8359.
IEEE DOI 2408
Task analysis, Visualization, Semantics, Crops, Question answering (information retrieval), Diseases, multi-modal tasks BibRef

Li, L.J.[Lin-Jun], Jin, T.[Tao], Lin, W.[Wang], Jiang, H.[Hao], Pan, W.W.[Wen-Wen], Wang, J.[Jian], Xiao, S.W.[Shu-Wen], Xia, Y.[Yan], Jiang, W.H.[Wei-Hao], Zhao, Z.[Zhou],
Multi-Granularity Relational Attention Network for Audio-Visual Question Answering,
CirSysVideo(34), No. 8, August 2024, pp. 7080-7094.
IEEE DOI 2408
Visualization, Question answering (information retrieval), Labeling, Manuals, Electronic commerce, Task analysis, Cognition, e-commerce dataset BibRef

Vosoughi, A.[Ali], Deng, S.J.[Shi-Jian], Zhang, S.Y.[Song-Yang], Tian, Y.[Yapeng], Xu, C.L.[Chen-Liang], Luo, J.B.[Jie-Bo],
Cross Modality Bias in Visual Question Answering: A Causal View With Possible Worlds VQA,
MultMed(26), 2024, pp. 8609-8624.
IEEE DOI 2408
Visualization, Faces, Training, Linguistics, Cultural differences, Question answering (information retrieval), Cognition, visual question answering (VQA) BibRef


Liu, X.L.[Xiu-Long], Dong, Z.K.[Zhi-Kang], Zhang, P.[Peng],
Tackling Data Bias in MUSIC-AVQA: Crafting a Balanced Dataset for Unbiased Question-Answering,
WACV24(4466-4475)
IEEE DOI Code:
WWW Link. 2404
Reviews, Computational modeling, Benchmark testing, Task analysis, Videos, Algorithms, Datasets and evaluations, Algorithms, Vision + language and/or other modalities BibRef

Shi, X.X.[Xiang-Xi], Lee, S.[Stefan],
Benchmarking Out-of-Distribution Detection in Visual Question Answering,
WACV24(5473-5483)
IEEE DOI 2404
Visualization, Computational modeling, Estimation, Benchmark testing, Predictive models, Feature extraction, Vision + language and/or other modalities BibRef

Venkataraman, S.R.[Sai Raam], Rao, R.S.[Rishi Sridhar], Balasubramanian, S., Sarma, R.R.[R. Raghunatha], Vorugunti, C.S.[Chandra Sekhar],
Can you even tell left from right? Presenting a new challenge for VQA,
WACV24(4486-4495)
IEEE DOI 2404
Training, Visualization, Games, Benchmark testing, Cognition, Question answering (information retrieval), Algorithms, Vision + language and/or other modalities BibRef

Sahu, P.P.[Pragya Paramita], Raut, A.[Abhishek], Samant, J.S.[Jagdish Singh], Gorijala, M.[Mahesh], Lakshminarayanan, V.[Vignesh], Bhaskar, P.[Pinaki],
POP-VQA: Privacy preserving, On-device, Personalized Visual Question Answering,
WACV24(8455-8464)
IEEE DOI 2404
Training, Visualization, Privacy, Biological system modeling, Computational modeling, System performance, Vision + language and/or other modalities BibRef

Li, J.P.[Jia-Peng], Wei, P.[Ping], Han, W.J.[Wen-Juan], Fan, L.F.[Li-Feng],
IntentQA: Context-aware Video Intent Reasoning,
ICCV23(11929-11940)
IEEE DOI Code:
WWW Link. 2401
BibRef

Hu, Y.S.[Yu-Shi], Hua, H.[Hang], Yang, Z.Y.[Zheng-Yuan], Shi, W.J.[Wei-Jia], Smith, N.A.[Noah A.], Luo, J.B.[Jie-Bo],
PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3,
ICCV23(2951-2963)
IEEE DOI 2401
BibRef

Reichman, B.[Benjamin], Heck, L.[Larry],
Cross-Modal Dense Passage Retrieval for Outside Knowledge Visual Question Answering,
CLVL23(2829-2834)
IEEE DOI 2401
BibRef

Naik, N.[Nandita], Potts, C.[Christopher], Kreiss, E.[Elisa],
Context-VQA: Towards Context-Aware and Purposeful Visual Question Answering,
CLVL23(2813-2817)
IEEE DOI 2401
BibRef

Hu, Y.S.[Yu-Shi], Liu, B.[Benlin], Kasai, J.[Jungo], Wang, Y.Z.[Yi-Zhong], Ostendorf, M.[Mari], Krishna, R.[Ranjay], Smith, N.A.[Noah A.],
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering,
ICCV23(20349-20360)
IEEE DOI 2401
BibRef

Zhang, Y.W.[Yu-Wei], Ho, C.H.[Chih-Hui], Vasconcelos, N.M.[Nuno M.],
Toward Unsupervised Realistic Visual Question Answering,
ICCV23(15567-15578)
IEEE DOI Code:
WWW Link. 2401
BibRef

Liang, K.[Kaiqu], Albanie, S.[Samuel],
Simple Baselines for Interactive Video Retrieval with Questions and Answers,
ICCV23(11057-11067)
IEEE DOI Code:
WWW Link. 2401
BibRef

Mensink, T.[Thomas], Uijlings, J.[Jasper], Castrejon, L.[Lluis], Goel, A.[Arushi], Cadar, F.[Felipe], Zhou, H.[Howard], Sha, F.[Fei], Araujo, A.[André], Ferrari, V.[Vittorio],
Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories,
ICCV23(3090-3101)
IEEE DOI 2401
BibRef

Qian, Z.[Zi], Wang, X.[Xin], Duan, X.G.[Xu-Guang], Qin, P.[Pengda], Li, Y.H.[Yu-Hong], Zhu, W.W.[Wen-Wu],
Decouple Before Interact: Multi-Modal Prompt Learning for Continual Visual Question Answering,
ICCV23(2941-2950)
IEEE DOI 2401
BibRef

Xue, D.[Dizhan], Qian, S.S.[Sheng-Sheng], Xu, C.S.[Chang-Sheng],
Variational Causal Inference Network for Explanatory Visual Question Answering,
ICCV23(2515-2525)
IEEE DOI 2401
BibRef

Bruni, P.[Pierfrancesco], Falcon, A.[Alex], Radeva, P.[Petia],
Time-aware Circulant Matrices for Question-based Temporal Localization,
CIAP23(II:182-195).
Springer DOI 2312
BibRef

Ferreira, B.C.L.[Bruno Carlos Luís], Oliveira, H.G.[Hugo Gonçalo], Silva, C.[Catarina],
Leveraging Question Answering for Domain-Agnostic Information Extraction,
CIARP23(I:244-256).
Springer DOI 2312
BibRef

Wu, Z.H.[Zi-Heng], Shu, X.Y.[Xin-Yao], Yan, S.Y.[Shi-Yang], Lu, Z.Y.[Zhen-Yu],
FGCVQA: Fine-Grained Cross-Attention for Medical VQA,
ICIP23(975-979)
IEEE DOI Code:
WWW Link. 2312
BibRef

Zhu, H.[He], Togo, R.[Ren], Ogawa, T.[Takahiro], Haseyama, M.[Miki],
Interpretable Visual Question Answering Referring to Outside Knowledge,
ICIP23(2140-2144)
IEEE DOI 2312
BibRef

Parelli, M.[Maria], Mallis, D.[Dimitrios], Diomataris, M.[Markos], Pitsikalis, V.[Vassilis],
Interpretable Visual Question Answering Via Reasoning Supervision,
ICIP23(2525-2529)
IEEE DOI 2312
BibRef

Hegde, S.[Shamanthak], Jahagirdar, S.[Soumya], Gangisetty, S.[Shankar],
Making the V in Text-VQA Matter,
ODRUM23(5580-5588)
IEEE DOI 2309
BibRef

Suo, W.[Wei], Sun, M.Y.[Meng-Yang], Liu, W.S.[Wei-Song], Gao, Y.Q.[Yi-Qi], Wang, P.[Peng], Zhang, Y.N.[Yan-Ning], Wu, Q.[Qi],
S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning,
CVPR23(2646-2656)
IEEE DOI 2309
BibRef

Alampalle, C.[Charani], Hegde, S.[Shamanthak], Jahagirdar, S.[Soumya], Gangisetty, S.[Shankar],
Weakly Supervised Visual Question Answer Generation,
ODRUM23(5589-5597)
IEEE DOI 2309
BibRef

Jiang, J.J.[Jing-Jing], Zheng, N.N.[Nan-Ning],
MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering,
CVPR23(24203-24213)
IEEE DOI 2309
BibRef

Wang, Y.[Ying], Pfeiffer, J.[Jonas], Carion, N.[Nicolas], Le Cun, Y.L.[Yann L.], Kamath, A.[Aishwarya],
Adapting Grounded Visual Question Answering Models to Low Resource Languages,
MULA23(2596-2605)
IEEE DOI 2309
BibRef

Wang, M.[Min], Mahjoubfar, A.[Ata], Joshi, A.[Anupama],
FashionVQA: A Domain-Specific Visual Question Answering System,
CVFAD23(3514-3519)
IEEE DOI 2309
BibRef

Shao, Z.W.[Zhen-Wei], Yu, Z.[Zhou], Wang, M.[Meng], Yu, J.[Jun],
Prompting Large Language Models with Answer Heuristics for Knowledge-Based Visual Question Answering,
CVPR23(14974-14983)
IEEE DOI 2309
BibRef

Tascon-Morales, S.[Sergio], Márquez-Neila, P.[Pablo], Sznitman, R.[Raphael],
Logical Implications for Visual Question Answering Consistency,
CVPR23(6725-6735)
IEEE DOI 2309
BibRef

Chen, S.[Shi], Zhao, Q.[Qi],
Divide and Conquer: Answering Questions with Object Factorization and Compositional Reasoning,
CVPR23(6736-6745)
IEEE DOI 2309
BibRef

Guo, J.X.[Jia-Xian], Li, J.[Junnan], Li, D.X.[Dong-Xu], Tiong, A.M.H.[Anthony Meng Huat], Li, B.Y.[Bo-Yang], Tao, D.C.[Da-Cheng], Hoi, S.[Steven],
From Images to Textual Prompts: Zero-shot Visual Question Answering with Frozen Large Language Models,
CVPR23(10867-10877)
IEEE DOI 2309
BibRef

Basu, A.[Abhipsa], Addepalli, S.[Sravanti], Babu, R.V.[R. Venkatesh],
RMLVQA: A Margin Loss Approach For Visual Question Answering with Language Biases,
CVPR23(11671-11680)
IEEE DOI 2309
BibRef

Li, B.J.[Bing-Jia], Wang, J.[Jie], Zhao, M.[Minyi], Zhou, S.[Shuigeng],
Two-stage Multimodality Fusion for High-performance Text-based Visual Question Answering,
ACCV22(IV:658-674).
Springer DOI 2307
BibRef

Vivoli, E.[Emanuele], Biten, A.F.[Ali Furkan], Mafla, A.[Andres], Karatzas, D.[Dimosthenis], Gomez, L.[Lluis],
MUST-VQA: Multilingual Scene-Text VQA,
TextEvery22(345-358).
Springer DOI 2304
BibRef

Chai, Z.[Zi], Wan, X.J.[Xiao-Jun], Han, S.C.[Soyeon Caren], Poon, J.[Josiah],
Visual Question Generation Under Multi-granularity Cross-Modal Interaction,
MMMod23(I: 255-266).
Springer DOI 2304
BibRef

Wang, J.H.[Jiang-Hai], Hu, M.H.[Meng-Hao], Song, Y.G.[Ya-Guang], Yang, X.S.[Xiao-Shan],
Health-Oriented Multimodal Food Question Answering,
MMMod23(I: 191-203).
Springer DOI 2304
BibRef

Bongini, P.[Pietro], Becattini, F.[Federico], del Bimbo, A.[Alberto],
Is GPT-3 All You Need for Visual Question Answering in Cultural Heritage?,
VisArt22(268-281).
Springer DOI 2304
BibRef

Jha, A.[Abhishek], Patro, B.[Badri], Van Gool, L.J.[Luc J.], Tuytelaars, T.[Tinne],
Barlow constrained optimization for Visual Question Answering,
WACV23(1084-1093)
IEEE DOI 2302
Training, Visualization, Computational modeling, Redundancy, Semantics, Minimization, visual reasoning BibRef

Ravi, S.[Sahithya], Chinchure, A.[Aditya], Sigal, L.[Leonid], Liao, R.J.[Ren-Jie], Shwartz, V.[Vered],
VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge,
WACV23(1155-1165)
IEEE DOI 2302
Comets, Visualization, Analytical models, Knowledge based systems, Linguistics, Transformers, visual reasoning BibRef

Etesam, Y.[Yasaman], Kochiev, L.[Leon], Chang, A.X.[Angel X.],
3DVQA: Visual Question Answering for 3D Environments,
CRV22(233-240)
IEEE DOI 2301
Point cloud compression, Surface reconstruction, Lighting, Question answering (information retrieval), Noise measurement, 3D BibRef

Ramamurthy, P.[Priyadharsini], Aakur, S.N.[Sathyanarayanan N.],
ISD-QA: Iterative Distillation of Commonsense Knowledge from General Language Models for Unsupervised Question Answering,
ICPR22(1229-1235)
IEEE DOI 2212
Transfer learning, Training data, Question answering (information retrieval), Data models, Iterative methods BibRef

Zhang, H.T.[Hao-Tian], Wu, W.[Wei],
CAT: Re-Conv Attention in Transformer for Visual Question Answering,
ICPR22(1471-1477)
IEEE DOI 2212
Representation learning, Visualization, Predictive models, Performance gain, Transformers, Feature extraction, Multi-modal task BibRef

Liu, L.[Lei], Su, X.D.[Xiang-Dong], Guo, H.[Hui], Zhu, D.[Daobin],
A Transformer-based Medical Visual Question Answering Model,
ICPR22(1712-1718)
IEEE DOI 2212
Training, Visualization, Transformers, Feature extraction, Question answering (information retrieval), Stability analysis, Data mining BibRef

Wu, X.Y.[Xiang-Yu], Lu, J.F.[Jian-Feng], Li, Z.F.[Zhuan-Feng], Xiong, F.C.[Feng-Chao],
Ques-to-Visual Guided Visual Question Answering,
ICIP22(4193-4197)
IEEE DOI 2211
Location awareness, Visualization, Fuses, Semantics, Benchmark testing, Question answering (information retrieval), channel attention BibRef

Sarkar, A.[Argho], Rahnemoonfar, M.[Maryam],
Grad-Cam Aware Supervised Attention for Visual Question Answering for Post-Disaster Damage Assessment,
ICIP22(3783-3787)
IEEE DOI 2211
Training, Visualization, Annotations, Pipelines, Question answering (information retrieval), Hurricanes, Grad-Cam BibRef

Whitehead, S.[Spencer], Petryk, S.[Suzanne], Shakib, V.[Vedaad], Gonzalez, J.[Joseph], Darrell, T.J.[Trevor J.], Rohrbach, A.[Anna], Rohrbach, M.[Marcus],
Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly,
ECCV22(XXXVI:148-166).
Springer DOI 2211
BibRef

Chen, L.[Long], Zheng, Y.H.[Yu-Hang], Xiao, J.[Jun],
Rethinking Data Augmentation for Robust Visual Question Answering,
ECCV22(XXXVI:95-112).
Springer DOI 2211
BibRef

Zhang, H.T.[Hao-Tian], Wu, W.[Wei],
Context Relation Fusion Model for Visual Question Answering,
ICIP22(2112-2116)
IEEE DOI 2211
Visualization, Question answering (information retrieval), Task analysis, Context modeling, Visual question answering, language bias BibRef

Biten, A.F.[Ali Furkan], Litman, R.[Ron], Xie, Y.S.[Yu-Sheng], Appalaraju, S.[Srikar], Manmatha, R.,
LaTr: Layout-Aware Transformer for Scene-Text VQA,
CVPR22(16527-16537)
IEEE DOI 2210
Training, Symbiosis, Visualization, Vocabulary, Layout, Transformers, Feature extraction, Vision + language, Scene analysis and understanding BibRef

Lu, J.Y.[Jia-Ying], Ye, X.[Xin], Ren, Y.[Yi], Yang, Y.Z.[Ye-Zhou],
Good, Better, Best: Textual Distractors Generation for Multiple-Choice Visual Question Answering via Reinforcement Learning,
ODRUM22(4917-4926)
IEEE DOI 2210
Training, Visualization, Computational modeling, Knowledge based systems, Training data, Reinforcement learning, Data models BibRef

Ding, Y.H.[Yi-Hao], Huang, Z.[Zhe], Wang, R.[Runlin], Zhang, Y.H.[Yan-Hang], Chen, X.[Xianru], Ma, Y.Z.[Yu-Zhong], Chung, H.[Hyunsuk], Han, S.C.[Soyeon Caren],
V-Doc: Visual questions answers with Documents,
CVPR22(21460-21466)
IEEE DOI 2210
Deep learning, Visualization, Computational modeling, Predictive models, Portable document format, Question answering (information retrieval) BibRef

Azuma, D.[Daichi], Miyanishi, T.[Taiki], Kurita, S.H.[Shu-Hei], Kawanabe, M.[Motoaki],
ScanQA: 3D Question Answering for Spatial Scene Understanding,
CVPR22(19107-19117)
IEEE DOI 2210
Location awareness, Measurement, Solid modeling, Visualization, Question answering (information retrieval), Vision + language, Scene analysis and understanding BibRef

Li, G.Y.[Guang-Yao], Wei, Y.[Yake], Tian, Y.[Yapeng], Xu, C.L.[Chen-Liang], Wen, J.R.[Ji-Rong], Hu, D.[Di],
Learning to Answer Questions in Dynamic Audio-Visual Scenarios,
CVPR22(19086-19096)
IEEE DOI 2210
Visualization, Image analysis, Codes, Computational modeling, Cognition, Question answering (information retrieval), Vision + language BibRef

Chen, C.[Chongyan], Anjum, S.[Samreen], Gurari, D.[Danna],
Grounding Answers for Visual Questions Asked by Visually Impaired People,
CVPR22(19076-19085)
IEEE DOI 2210
Visualization, Correlation, Grounding, Text recognition, Computational modeling, Visual impairment, Vision + language BibRef

Jing, C.C.[Chen-Chen], Jia, Y.D.[Yun-De], Wu, Y.W.[Yu-Wei], Liu, X.Y.[Xin-Yu], Wu, Q.[Qi],
Maintaining Reasoning Consistency in Compositional Visual Question Answering,
CVPR22(5089-5098)
IEEE DOI 2210
Visualization, Birds, Cognition, Question answering (information retrieval), Visual reasoning BibRef

Cascante-Bonilla, P.[Paola], Wu, H.[Hui], Wang, L.[Letao], Feris, R.S.[Rogerio S.], Ordonez, V.[Vicente],
Sim VQA: Exploring Simulated Environments for Visual Question Answering,
CVPR22(5046-5056)
IEEE DOI 2210
Training, Visualization, Solid modeling, Computational modeling, Pipelines, Switches, Vision + language, Visual reasoning BibRef

Gupta, V.[Vipul], Li, Z.W.[Zhuo-Wan], Kortylewski, A.[Adam], Zhang, C.Y.[Chen-Yu], Li, Y.W.[Ying-Wei], Yuille, A.L.[Alan L.],
SwapMix: Diagnosing and Regularizing the Over-Reliance on Visual Context in Visual Question Answering,
CVPR22(5068-5078)
IEEE DOI 2210
Training, Visualization, Perturbation methods, Computational modeling, Predictive models, Robustness, Visual reasoning BibRef

Burghouts, G.J.[Gertjan J.], Huizinga, W.[Wyke],
Coarse-to-Fine Visual Question Answering by Iterative, Conditional Refinement,
CIAP22(II:418-428).
Springer DOI 2205
BibRef

Kant, Y.[Yash], Moudgil, A.[Abhinav], Batra, D.[Dhruv], Parikh, D.[Devi], Agrawal, H.[Harsh],
Contrast and Classify: Training Robust VQA Models,
ICCV21(1584-1593)
IEEE DOI 2203
Training, Visualization, Perturbation methods, Linguistics, Benchmark testing, Boosting, Vision + language BibRef

Han, X.Z.[Xin-Zhe], Wang, S.H.[Shu-Hui], Su, C.[Chi], Huang, Q.M.[Qing-Ming], Tian, Q.[Qi],
Greedy Gradient Ensemble for Robust Visual Question Answering,
ICCV21(1564-1573)
IEEE DOI 2203
Visualization, Analytical models, Annotations, Computational modeling, Feature extraction, Data models BibRef

Dancette, C.[Corentin], Cadène, R.[Rémi], Teney, D.[Damien], Cord, M.[Matthieu],
Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering,
ICCV21(1554-1563)
IEEE DOI 2203
Training, Visualization, Protocols, Codes, Image color analysis, Computational modeling, Vision + language, Explainable AI, Visual reasoning and logical representation BibRef

Zhou, Y.[Yiyi], Ren, T.[Tianhe], Zhu, C.Y.[Chao-Yang], Sun, X.S.[Xiao-Shuai], Liu, J.Z.[Jian-Zhuang], Ding, X.H.[Xing-Hao], Xu, M.L.[Ming-Liang], Ji, R.R.[Rong-Rong],
TRAR: Routing the Attention Spans in Transformer for Visual Question Answering,
ICCV21(2054-2064)
IEEE DOI 2203
Visualization, Schedules, Computational modeling, Transforms, Benchmark testing, Performance gain, Transformers BibRef

Yang, X.[Xu], Gao, C.Y.[Chong-Yang], Zhang, H.W.[Han-Wang], Cai, J.F.[Jian-Fei],
Auto-Parsing Network for Image Captioning and Visual Question Answering,
ICCV21(2177-2187)
IEEE DOI 2203
Training, Visualization, Graphical models, Stacking, Probability, Transformers, Vision + language BibRef

Banerjee, P.[Pratyay], Gokhale, T.[Tejas], Yang, Y.Z.[Ye-Zhou], Baral, C.[Chitta],
Weakly Supervised Relative Spatial Reasoning for Visual Question Answering,
ICCV21(1888-1898)
IEEE DOI 2203
Geometry, Visualization, Grounding, Semantics, Estimation, Predictive models, Vision + language, Visual reasoning and logical representation BibRef

Li, L.J.[Lin-Jie], Lei, J.[Jie], Gan, Z.[Zhe], Liu, J.J.[Jing-Jing],
Adversarial VQA: A New Benchmark for Evaluating the Robustness of VQA Models,
ICCV21(2022-2031)
IEEE DOI 2203
Training, Visualization, Analytical models, Computational modeling, Benchmark testing, Robustness, Vision + language BibRef

Askarian, N.[Narjes], Abbasnejad, E.[Ehsan], Zukerman, I.[Ingrid], Buntine, W.[Wray], Haffari, G.[Gholamreza],
Inductive Biases for Low Data VQA: A Data Augmentation Approach,
Novelty22(231-240)
IEEE DOI 2202
Training, Visualization, Conferences, Natural languages, Image annotation, Data models BibRef

Mathew, M.[Minesh], Bagal, V.[Viraj], Tito, R.[Rubèn], Karatzas, D.[Dimosthenis], Valveny, E.[Ernest], Jawahar, C.V.,
InfographicVQA,
WACV22(2582-2591)
IEEE DOI 2202
Visualization, Computational modeling, Layout, Data visualization, Benchmark testing, Brain modeling, Vision and Languages BibRef

Kumar, S.[Sumit], Patro, B.N.[Badri N.], Namboodiri, V.P.[Vinay P.],
Auto QA: The Question Is Not Only What, but Also Where,
Novelty22(272-281)
IEEE DOI 2202
Location awareness, Visualization, Laser radar, Conferences, Semantics, Sensor systems BibRef

Kolling, C.[Camila], More, M.[Martin], Gavenski, N.[Nathan], Pooch, E.[Eduardo], Parraga, O.[Otávio], Barros, R.C.[Rodrigo C.],
Efficient Counterfactual Debiasing for Visual Question Answering,
WACV22(2572-2581)
IEEE DOI 2202
Training, Visualization, Frequency synthesizers, Correlation, Grounding, Computational modeling, Synthesizers, Analysis and Understanding BibRef

Jung, S.J.[Seung-Jun], Byun, J.Y.[Jun-Young], Shim, K.[Kyujin], Hwang, S.Y.[Sangh-Yun], Kim, C.[Changick],
Understanding VQA for Negative Answers Through Visual and Linguistic Inference,
ICIP21(2873-2877)
IEEE DOI 2201
Visualization, Image processing, Linguistics, Knowledge discovery, Inference algorithms, Reliability, Image Captioning, Constrained Beam Search BibRef

Felix, R.[Rafael], Repasky, B.[Boris], Hodge, S.[Samuel], Zolfaghari, R.[Reza], Abbasnejad, E.[Ehsan], Sherrah, J.[Jamie],
Cross-Modal Visual Question Answering for Remote Sensing Data,
DICTA21(1-9)
IEEE DOI 2201
Earth, Visualization, Satellites, Digital images, Natural languages, Machine learning, Transformers, Visual Question Answering, OpenStreetMap BibRef

Le, T.[Tung], Nguyen, H.T.[Huy Tien], Nguyen, M.L.[Minh Le],
Vision and Text Transformer for Predicting Answerability on Visual Question Answering,
ICIP21(934-938)
IEEE DOI 2201
Visualization, Image processing, Predictive models, Knowledge discovery, Robustness, Task analysis, Answerability, Multi-head Attention BibRef

Huang, Z.Q.[Zi-Qi], Zhu, H.Y.[Hong-Yuan], Sun, Y.[Ying], Choi, D.[Dongkyu], Tan, C.[Cheston], Lim, J.H.[Joo-Hwee],
A Diagnostic Study of Visual Question Answering With Analogical Reasoning,
ICIP21(2463-2467)
IEEE DOI 2201
Location awareness, Visualization, Image processing, Natural languages, Benchmark testing, Tools, Knowledge discovery, benchmark BibRef

Chen, H.Y.[Hong-Yu], Liu, R.F.[Rui-Fang], Peng, B.[Bo],
Cross-modal Relational Reasoning Network for Visual Question Answering,
MAIR2-21(3939-3948)
IEEE DOI 2112
Bridges, Visualization, Semantics, Knowledge discovery, Linear programming BibRef

Wang, Z.X.[Zi-Xu], Miao, Y.[Yishu], Specia, L.[Lucia],
Latent Variable Models for Visual Question Answering,
CLVL21(3137-3141)
IEEE DOI 2112
Training, Visualization, Computer aided instruction, Benchmark testing, Knowledge discovery BibRef

Hirota, Y.[Yusuke], Garcia, N.[Noa], Otani, M.[Mayu], Chu, C.[Chenhui], Nakashima, Y.[Yuta], Taniguchi, I.[Ittetsu], Onoye, T.[Takao],
Visual Question Answering with Textual Representations for Images,
CLVL21(3147-3150)
IEEE DOI 2112
Visualization, Computational modeling, Knowledge discovery, Feature extraction, Object recognition BibRef

Ye, K.[Keren], Kovashka, A.[Adriana],
Linguistic Structures as Weak Supervision for Visual Scene Graph Generation,
CVPR21(8285-8295)
IEEE DOI 2111
Location awareness, Visualization, Blogs, Linguistics, Pattern recognition, Noise measurement BibRef

Xiao, J.B.[Jun-Bin], Shang, X.[Xindi], Yao, A.[Angela], Chua, T.S.[Tat-Seng],
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions,
CVPR21(9772-9781)
IEEE DOI 2111
Adaptation models, Benchmark testing, Knowledge discovery, Cognition, Pattern recognition, Task analysis BibRef

Chen, X.Y.[Xian-Yu], Jiang, M.[Ming], Zhao, Q.[Qi],
Predicting Human Scanpaths in Visual Question Answering,
CVPR21(10871-10880)
IEEE DOI 2111
Training, Visualization, Reinforcement learning, Predictive models, Tools, Knowledge discovery BibRef

Qi, Y.G.[Yong-Gang], Zhang, K.[Kai], Sain, A.[Aneeshan], Song, Y.Z.[Yi-Zhe],
PQA: Perceptual Question Answering,
CVPR21(12051-12059)
IEEE DOI 2111
Visualization, Training data, Psychology, Organizations, Visual systems, Knowledge discovery, Data models BibRef

Yuan, Y.Y.[Yuan-Yuan], Wang, S.[Shuai], Jiang, M.Y.[Ming-Yue], Chen, T.Y.[Tsong Yueh],
Perception Matters: Detecting Perception Failures of VQA Models Using Metamorphic Testing,
CVPR21(16903-16912)
IEEE DOI 2111
Visualization, Computational modeling, Transforms, Benchmark testing, Knowledge discovery, Cognition BibRef

Marino, K.[Kenneth], Chen, X.L.[Xin-Lei], Parikh, D.[Devi], Gupta, A.[Abhinav], Rohrbach, M.[Marcus],
KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA,
CVPR21(14106-14116)
IEEE DOI 2111
Training, Vocabulary, Knowledge based systems, Semantics, Training data, Knowledge representation, Predictive models BibRef

Niu, Y.[Yulei], Tang, K.[Kaihua], Zhang, H.W.[Han-Wang], Lu, Z.W.[Zhi-Wu], Hua, X.S.[Xian-Sheng], Wen, J.R.[Ji-Rong],
Counterfactual VQA: A Cause-Effect Look at Language Bias,
CVPR21(12695-12705)
IEEE DOI 2111
Codes, Linguistics, Robustness, Cognition, Pattern recognition BibRef

Yang, Z.Y.[Zheng-Yuan], Lu, Y.J.[Yi-Juan], Wang, J.F.[Jian-Feng], Yin, X.[Xi], Florencio, D.[Dinei], Wang, L.J.[Li-Juan], Zhang, C.[Cha], Zhang, L.[Lei], Luo, J.B.[Jie-Bo],
TAP: Text-Aware Pre-training for Text-VQA and Text-Caption,
CVPR21(8747-8757)
IEEE DOI 2111
Visualization, Training data, Predictive models, Knowledge discovery, Pattern recognition, Optical character recognition software BibRef

Kervadec, C.[Corentin], Jaunet, T.[Théo], Antipov, G.[Grigory], Baccouche, M.[Moez], Vuillemot, R.[Romain], Wolf, C.[Christian],
How Transferable are Reasoning Patterns in VQA?,
CVPR21(4205-4214)
IEEE DOI 2111
Visualization, Analytical models, Data visualization, Tools, Transformers, Cognition, Data models BibRef

Kervadec, C.[Corentin], Antipov, G.[Grigory], Baccouche, M.[Moez], Wolf, C.[Christian],
Roses are Red, Violets are Blue… But Should VQA expect Them To?,
CVPR21(2775-2784)
IEEE DOI 2111
Training, Measurement, Visualization, Computational modeling, Benchmark testing, Knowledge discovery BibRef

Dua, R.[Radhika], Kancheti, S.S.[Sai Srinivas], Balasubramanian, V.N.[Vineeth N.],
Beyond VQA: Generating Multi-word Answers and Rationales to Visual Questions,
MULA21(1623-1632)
IEEE DOI 2109
Deep learning, Visualization, Vocabulary, Computational modeling, Knowledge discovery BibRef

Rahman, T.[Tanzila], Chou, S.H.[Shih-Han], Sigal, L.[Leonid], Carenini, G.[Giuseppe],
An Improved Attention for Visual Question Answering,
MULA21(1653-1662)
IEEE DOI 2109
Visualization, Computational modeling, Natural languages, Logic gates BibRef

Jolly, S.[Shailza], Palacio, S.[Sebastian], Folz, J.[Joachim], Raue, F.[Federico], Hees, J.[Jörn], Dengel, A.[Andreas],
P ≈ NP, at least in Visual Question Answering,
ICPR21(2748-2754)
IEEE DOI 2105
Training, Visualization, Upper bound, Knowledge discovery, Pattern recognition BibRef

Farazi, M.[Moshiur], Khan, S.[Salman], Barnes, N.M.[Nick M.],
Question-Agnostic Attention for Visual Question Answering,
ICPR21(3542-3549)
IEEE DOI 2105
Training, Visualization, Image resolution, Preforms, Computational modeling, Semantics, Focusing, Multimodal Fusion BibRef

Li, Y.[Yanan], Lin, Y.[Yuetan], Zhao, H.H.[Hong-Hui], Wang, D.H.[Dong-Hui],
Dual Path Multi-Modal High-Order Features for Textual Content based Visual Question Answering,
ICPR21(4324-4331)
IEEE DOI 2105
Visualization, Image recognition, Image coding, Correlation, Text recognition, Fuses, Semantics BibRef

Mishra, A.[Aakansha], Anand, A.[Ashish], Guha, P.[Prithwijit],
Multi-stage Attention based Visual Question Answering,
ICPR21(9407-9414)
IEEE DOI 2105
Visualization, Analytical models, Bidirectional control, Benchmark testing, Knowledge discovery, Pattern recognition, Attention Network BibRef

Bozinis, T.[Theodoros], Passalis, N.[Nikolaos], Tefas, A.[Anastasios],
Improving Visual Question Answering using Active Perception on Static Images,
ICPR21(879-884)
IEEE DOI 2105
Deep learning, Visualization, Analytical models, Image resolution, Active perception, Reinforcement learning, Knowledge discovery BibRef

Huang, H.T.[Han-Tao], Han, T.[Tao], Han, W.[Wei], Yap, D.[Deep], Chiang, C.M.[Cheng-Ming],
Answer-checking in Context: A Multi-modal Fully Attention Network for Visual Question Answering,
ICPR21(1173-1180)
IEEE DOI 2105
Visualization, Bit error rate, Image representation, Knowledge discovery, Pattern recognition BibRef

Sun, Q.[Qiang], Xie, B.H.[Bing-Hui], Fu, Y.W.[Yan-Wei],
Second Order Enhanced Multi-Glimpse Attention in Visual Question Answering,
ACCV20(IV:87-103).
Springer DOI 2103
BibRef

Goel, V.[Vatsal], Chandak, M.[Mohit], Anand, A.[Ashish], Guha, P.[Prithwijit],
IQ-VQA: Intelligent Visual Question Answering,
VTIUR20(357-370).
Springer DOI 2103
BibRef

Qiao, Y., Yu, Z., Liu, J.,
VC-VQA: Visual Calibration Mechanism For Visual Question Answering,
ICIP20(1481-1485)
IEEE DOI 2011
Visualization, Image reconstruction, Calibration, Task analysis, Predictive models, Feature extraction, Knowledge discovery, Feature Reconstruction BibRef

Tang, R.X.[Rui-Xue], Ma, C.[Chao], Zhang, W.E.[Wei Emma], Wu, Q.[Qi], Yang, X.K.[Xiao-Kang],
Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering,
ECCV20(XIX:437-453).
Springer DOI 2011
BibRef

Gokhale, T.[Tejas], Banerjee, P.[Pratyay], Baral, C.[Chitta], Yang, Y.Z.[Ye-Zhou],
VQA-LOL: Visual Question Answering Under the Lens of Logic,
ECCV20(XXI:379-396).
Springer DOI 2011
BibRef

Yang, X.F.[Xiao-Feng], Lin, G.S.[Guo-Sheng], Lv, F.M.[Feng-Mao], Liu, F.Y.[Fa-Yao],
TRRNET: Tiered Relation Reasoning for Compositional Visual Question Answering,
ECCV20(XXI:414-430).
Springer DOI 2011
BibRef

Bansal, A.[Ankan], Zhang, Y.[Yuting], Chellappa, R.[Rama],
Visual Question Answering on Image Sets,
ECCV20(XXI:51-67).
Springer DOI 2011
BibRef

Han, X.Z.[Xin-Zhe], Wang, S.H.[Shu-Hui], Su, C.[Chi], Zhang, W.G.[Wei-Gang], Huang, Q.M.[Qing-Ming], Tian, Q.[Qi],
Interpretable Visual Reasoning via Probabilistic Formulation Under Natural Supervision,
ECCV20(IX:553-570).
Springer DOI 2011
BibRef

Kant, Y.[Yash], Batra, D.[Dhruv], Anderson, P.[Peter], Schwing, A.[Alexander], Parikh, D.[Devi], Lu, J.[Jiasen], Agrawal, H.[Harsh],
Spatially Aware Multimodal Transformers for TextVQA,
ECCV20(IX:715-732).
Springer DOI 2011
BibRef

Li, Q.[Qing], Huang, S.Y.[Si-Yuan], Hong, Y.[Yining], Zhu, S.C.[Song-Chun],
A Competence-aware Curriculum for Visual Concepts Learning via Question Answering,
ECCV20(II:141-157).
Springer DOI 2011
BibRef

Bajaj, G., Bandyopadhyay, B., Schmidt, D., Maneriker, P., Myers, C., Parthasarathy, S.,
Understanding Knowledge Gaps in Visual Question Answering: Implications for Gap Identification and Testing,
MVM20(1563-1566)
IEEE DOI 2008
Cognition, Training, Task analysis, Artificial intelligence, Global communication, Taxonomy, Semantics BibRef

Vatashsky, B., Ullman, S.,
VQA With No Questions-Answers Training,
CVPR20(10373-10383)
IEEE DOI 2008
Visualization, Training, Image color analysis, Knowledge discovery, Boats, Image analysis, Task analysis BibRef

Jiang, H., Misra, I., Rohrbach, M., Learned-Miller, E.G., Chen, X.,
In Defense of Grid Features for Visual Question Answering,
CVPR20(10264-10273)
IEEE DOI 2008
Feature extraction, Visualization, Task analysis, Detectors, Object detection, Training, Pipelines BibRef

Wang, X., Liu, Y., Shen, C., Ng, C.C., Luo, C., Jin, L., Chan, C.S., van den Hengel, A., Wang, L.,
On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering,
CVPR20(10123-10132)
IEEE DOI 2008
Measurement, Cognition, Knowledge discovery, Correlation, Task analysis, Visualization, Optical character recognition software BibRef

Xiong, P., Wu, Y.,
TA-Student VQA: Multi-Agents Training by Self-Questioning,
CVPR20(10062-10072)
IEEE DOI 2008
Visualization, Training, Knowledge discovery, Standards, Task analysis, Boosting BibRef

Agarwal, V., Shetty, R., Fritz, M.,
Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing,
CVPR20(9687-9695)
IEEE DOI 2008
Data models, Robustness, Predictive models, Semantics, Correlation, Vocabulary, Visualization BibRef

Hu, R., Singh, A., Darrell, T.J., Rohrbach, M.,
Iterative Answer Prediction With Pointer-Augmented Multimodal Transformers for TextVQA,
CVPR20(9989-9999)
IEEE DOI 2008
Optical character recognition software, Task analysis, Feature extraction, Visualization, Iterative decoding, Vocabulary, Predictive models BibRef

Kafle, K., Shrestha, R., Price, B., Cohen, S., Kanan, C.,
Answering Questions about Data Visualizations using Efficient Bimodal Fusion,
WACV20(1487-1496)
IEEE DOI 2006
Bars, Data visualization, Image color analysis, Visualization, Task analysis, Optical character recognition software, Training BibRef

Patro, B.N., Patel, S., Namboodiri, V.P.,
Robust Explanations for Visual Question Answering,
WACV20(1566-1575)
IEEE DOI 2006
Visualization, Robustness, Perturbation methods, Knowledge discovery, Collaboration, Task analysis, Coherence BibRef

Chou, S., Chao, W., Lai, W., Sun, M., Yang, M.,
Visual Question Answering on 360° Images,
WACV20(1596-1605)
IEEE DOI 2006
Visualization, Task analysis, Feature extraction, Distortion, Cognition, Image color analysis, Spatial resolution BibRef

Chaudhry, R., Shekhar, S., Gupta, U., Maneriker, P., Bansal, P., Joshi, A.,
LEAF-QA: Locate, Encode & Attend for Figure Question Answering,
WACV20(3501-3510)
IEEE DOI 2006
Bars, Knowledge discovery, Image color analysis, Training, Vocabulary, Data mining, Data visualization BibRef

Liang, Y.Z.[Yuan-Zhi], Bai, Y.L.[Ya-Long], Zhang, W.[Wei], Qian, X.M.[Xue-Ming], Zhu, L.[Li], Mei, T.[Tao],
VrR-VG: Refocusing Visually-Relevant Relationships,
ICCV19(10402-10411)
IEEE DOI 2004
bioinformatics, data mining, data visualisation, feature extraction, genomics, graph theory, image annotation, Cognition BibRef

Bhattacharya, N., Li, Q., Gurari, D.,
Why Does a Visual Question Have Different Answers?,
ICCV19(4270-4279)
IEEE DOI 2004
Code, Visual Q-A.
WWW Link. question answering (information retrieval), visual question answering, Visualization, Powders, Task analysis, Computer vision BibRef

Li, L.J.[Lin-Jie], Gan, Z.[Zhe], Cheng, Y.[Yu], Liu, J.J.[Jing-Jing],
Relation-Aware Graph Attention Network for Visual Question Answering,
ICCV19(10312-10321)
IEEE DOI 2004
data visualisation, graph theory, learning (artificial intelligence), object detection, Computational modeling BibRef

Peng, G.[Gao], You, H.X.[Hao-Xuan], Zhang, Z.P.[Zhan-Peng], Wang, X.G.[Xiao-Gang], Li, H.S.[Hong-Sheng],
Multi-Modality Latent Interaction Network for Visual Question Answering,
ICCV19(5824-5834)
IEEE DOI 2004
data visualisation, image representation, image retrieval, learning (artificial intelligence), Object detection BibRef

Do, T., Tran, H., Do, T., Tjiputra, E., Tran, Q.,
Compact Trilinear Interaction for Visual Question Answering,
ICCV19(392-401)
IEEE DOI 2004
learning (artificial intelligence), matrix decomposition, Correlation BibRef

Schwartz, I.[Idan], Yu, S.[Seunghak], Hazan, T.[Tamir], Schwing, A.G.[Alexander G.],
Factor Graph Attention,
CVPR19(2039-2048).
IEEE DOI 2002
BibRef

Kolesnikov, A.[Alexander], Beyer, L.[Lucas], Zhai, X.H.[Xiao-Hua], Puigcerver, J.[Joan], Yung, J.[Jessica], Gelly, S.[Sylvain], Houlsby, N.[Neil],
Big Transfer (BIT): General Visual Representation Learning,
ECCV20(V:491-507).
Springer DOI 2011
BibRef

Kolesnikov, A.[Alexander], Zhai, X.H.[Xiao-Hua], Beyer, L.[Lucas],
Revisiting Self-Supervised Visual Representation Learning,
CVPR19(1920-1929).
IEEE DOI 2002
BibRef

Xiong, P.X.[Pei-Xi], Zhan, H.Y.[Hua-Yi], Wang, X.[Xin], Sinha, B.[Baivab], Wu, Y.[Ying],
Visual Query Answering by Entity-Attribute Graph Matching and Reasoning,
CVPR19(8349-8358).
IEEE DOI 2002
BibRef

Singh, A.[Amanpreet], Natarajan, V.[Vivek], Shah, M.[Meet], Jiang, Y.[Yu], Chen, X.L.[Xin-Lei], Batra, D.[Dhruv], Parikh, D.[Devi], Rohrbach, M.[Marcus],
Towards VQA Models That Can Read,
CVPR19(8309-8318).
IEEE DOI 2002
BibRef

Manjunatha, V.[Varun], Saini, N.[Nirat], Davis, L.S.[Larry S.],
Explicit Bias Discovery in Visual Question Answering Models,
CVPR19(9554-9563).
IEEE DOI 2002
BibRef

Shrestha, R.[Robik], Kafle, K.[Kushal], Kanan, C.[Christopher],
Answer Them All! Toward Universal Visual Question Answering Models,
CVPR19(10464-10473).
IEEE DOI 2002
BibRef

Noh, H.[Hyeonwoo], Kim, T.[Taehoon], Mun, J.[Jonghwan], Han, B.H.[Bo-Hyung],
Transfer Learning via Unsupervised Task Discovery for Visual Question Answering,
CVPR19(8377-8386).
IEEE DOI 2002
BibRef

Wijmans, E.[Erik], Datta, S.[Samyak], Maksymets, O.[Oleksandr], Das, A.[Abhishek], Gkioxari, G.[Georgia], Lee, S.[Stefan], Essa, I.[Irfan], Parikh, D.[Devi], Batra, D.[Dhruv],
Embodied Question Answering in Photorealistic Environments With Point Cloud Perception,
CVPR19(6652-6661).
IEEE DOI 2002
BibRef

Shah, M.[Meet], Chen, X.L.[Xin-Lei], Rohrbach, M.[Marcus], Parikh, D.[Devi],
Cycle-Consistency for Robust Visual Question Answering,
CVPR19(6642-6651).
IEEE DOI 2002
BibRef

Li, H.[Hui], Wang, P.[Peng], Shen, C.H.[Chun-Hua], van den Hengel, A.[Anton],
Visual Question Answering as Reading Comprehension,
CVPR19(6312-6321).
IEEE DOI 2002
BibRef

Yu, L.C.[Li-Cheng], Chen, X.L.[Xin-Lei], Gkioxari, G.[Georgia], Bansal, M.[Mohit], Berg, T.L.[Tamara L.], Batra, D.[Dhruv],
Multi-Target Embodied Question Answering,
CVPR19(6302-6311).
IEEE DOI 2002
BibRef

Yu, Z.[Zhou], Yu, J.[Jun], Cui, Y.H.[Yu-Hao], Tao, D.C.[Da-Cheng], Tian, Q.[Qi],
Deep Modular Co-Attention Networks for Visual Question Answering,
CVPR19(6274-6283).
IEEE DOI 2002
BibRef

Abbasnejad, E.[Ehsan], Wu, Q.[Qi], Shi, Q.F.[Qin-Feng], van den Hengel, A.[Anton],
What's to Know? Uncertainty as a Guide to Asking Goal-Oriented Questions,
CVPR19(4150-4159).
IEEE DOI 2002
BibRef

Schwenk, D.[Dustin], Khandelwal, A.[Apoorv], Clark, C.[Christopher], Marino, K.[Kenneth], Mottaghi, R.[Roozbeh],
A-OKVQA: A Benchmark for Visual Question Answering Using World Knowledge,
ECCV22(VIII:146-162).
Springer DOI 2211
BibRef

Marino, K.[Kenneth], Rastegari, M.[Mohammad], Farhadi, A.[Ali], Mottaghi, R.[Roozbeh],
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge,
CVPR19(3190-3199).
IEEE DOI 2002
BibRef

Krishna, R.[Ranjay], Bernstein, M.[Michael], Fei-Fei, L.[Li],
Information Maximizing Visual Question Generation,
CVPR19(2008-2018).
IEEE DOI 2002
BibRef

Cadene, R.[Remi], Ben-younes, H.[Hedi], Cord, M.[Matthieu], Thome, N.[Nicolas],
MUREL: Multimodal Relational Reasoning for Visual Question Answering,
CVPR19(1989-1998).
IEEE DOI 2002
BibRef

Haurilet, M.[Monica], Roitberg, A.[Alina], Stiefelhagen, R.[Rainer],
It's Not About the Journey; It's About the Destination: Following Soft Paths Under Question-Guidance for Visual Reasoning,
CVPR19(1930-1939).
IEEE DOI 2002
BibRef

Qiu, Y., Satoh, Y., Suzuki, R., Kataoka, H.,
Incorporating 3D Information Into Visual Question Answering,
3DV19(756-765)
IEEE DOI 1911
Feature extraction, Task analysis, Visualization, Natural language processing, Cognition, Human computer interaction BibRef

Haurilet, M.[Monica], Al-Halah, Z.[Ziad], Stiefelhagen, R.[Rainer],
DynGraph: Visual Question Answering via Dynamic Scene Graphs,
GCPR19(428-441).
Springer DOI 1911
BibRef
Earlier:
MoQA: A Multi-modal Question Answering Architecture,
VL18(IV:106-113).
Springer DOI 1905
BibRef

Liu, F., Liu, J., Fang, Z., Lu, H.,
Language and Visual Relations Encoding for Visual Question Answering,
ICIP19(3307-3311)
IEEE DOI 1910
Visual question answering, Relations, Attention BibRef

Fang, Z.W.[Zhi-Wei], Liu, J.[Jing], Tang, Q.[Qu], Li, Y.[Yong], Lu, H.Q.[Han-Qing],
Answer Distillation for Visual Question Answering,
ACCV18(I:72-87).
Springer DOI 1906
BibRef

Kuhnle, A.[Alexander], Xie, H.Y.[Hui-Yuan], Copestake, A.[Ann],
How Clever Is the FiLM Model, and How Clever Can it Be?,
VL18(IV:162-172).
Springer DOI 1905
BibRef

Li, W.[Wei], Yuan, Z.H.[Ze-Huan], Fang, X.Z.[Xiang-Zhong], Wang, C.[Changhu],
Knowing Where to Look? Analysis on Attention of Visual Question Answering System,
VL18(IV:145-152).
Springer DOI 1905
BibRef

Wagner, M.[Misha], Basevi, H.[Hector], Shetty, R.[Rakshith], Li, W.B.[Wen-Bin], Malinowski, M.[Mateusz], Fritz, M.[Mario], Leonardis, A.[Aleš],
Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions,
VLEASE18(I:521-537).
Springer DOI 1905
BibRef

Duke, B., Taylor, G.W.,
Generalized Hadamard-Product Fusion Operators for Visual Question Answering,
CRV18(39-46)
IEEE DOI 1812
Feature extraction, Visualization, Task analysis, Data models, Mathematical model, Natural languages, Model Selection, Visual Question-Answering BibRef

Das, A., Datta, S., Gkioxari, G., Lee, S., Parikh, D., Batra, D.,
Embodied Question Answering,
CVPR18(1-10)
IEEE DOI 1812
Navigation, Visualization, Task analysis, Automobiles, Knowledge discovery BibRef

Misra, I., Girshick, R., Fergus, R., Hebert, M., Gupta, A., van der Maaten, L.[Laurens],
Learning by Asking Questions,
CVPR18(11-20)
IEEE DOI 1812
Training, Proposals, Visualization, Knowledge discovery, Standards, Task analysis, Data models BibRef

Gurari, D., Li, Q., Stangl, A.J., Guo, A., Lin, C., Grauman, K., Luo, J., Bigham, J.P.,
VizWiz Grand Challenge: Answering Visual Questions from Blind People,
CVPR18(3608-3617)
IEEE DOI 1812
Visualization, Blindness, Prediction algorithms, Lighting, Mobile handsets, Shape BibRef

Li, J., Su, H., Zhu, J., Wang, S., Zhang, B.,
Textbook Question Answering Under Instructor Guidance with Memory Networks,
CVPR18(3655-3663)
IEEE DOI 1812
Task analysis, Cognition, Visualization, Feature extraction, Semantics, Knowledge discovery, Drugs BibRef

Gordon, D., Kembhavi, A., Rastegari, M., Redmon, J., Fox, D., Farhadi, A.,
IQA: Visual Question Answering in Interactive Environments,
CVPR18(4089-4098)
IEEE DOI 1812
Task analysis, Navigation, Visualization, Knowledge discovery, Semantics, Planning BibRef

Agrawal, A., Batra, D., Parikh, D., Kembhavi, A.,
Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering,
CVPR18(4971-4980)
IEEE DOI 1812
Image color analysis, Visualization, Data models, Training data, Training, Knowledge discovery, Dogs BibRef

Sha, F., Chao, W., Hu, H.,
Learning Answer Embeddings for Visual Question Answering,
CVPR18(5428-5436)
IEEE DOI 1812
Visualization, Semantics, Probabilistic logic, Computational modeling, Task analysis, Training, Adaptation models BibRef

Kafle, K., Price, B., Cohen, S., Kanan, C.,
DVQA: Understanding Data Visualizations via Question Answering,
CVPR18(5648-5656)
IEEE DOI 1812
Bars, Cognition, Image color analysis, Visualization, Data visualization, Data mining, Knowledge discovery BibRef

Sha, F., Hu, H., Chao, W.,
Cross-Dataset Adaptation for Visual Question Answering,
CVPR18(5716-5725)
IEEE DOI 1812
Visualization, Task analysis, Adaptation models, Knowledge discovery, Games, Training, Target recognition BibRef

Anderson, P., He, X., Buehler, C., Teney, D., Johnson, M., Gould, S., Zhang, L.,
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering,
CVPR18(6077-6086)
IEEE DOI 1812
Visualization, Task analysis, Proposals, Mathematical model, Servers, Context modeling, Object detection BibRef

Nguyen, D., Okatani, T.,
Improved Fusion of Visual and Language Representations by Dense Symmetric Co-attention for Visual Question Answering,
CVPR18(6087-6096)
IEEE DOI 1812
Feature extraction, Visualization, Fuses, Knowledge discovery, Bidirectional control BibRef

Patro, B., Namboodiri, V.P.,
Differential Attention for Visual Question Answering,
CVPR18(7680-7688)
IEEE DOI 1812
Semantics, Task analysis, Visualization, Knowledge discovery, Correlation, Measurement, Training BibRef

Su, Z.[Zhou], Zhu, C.[Chen], Dong, Y.P.[Yin-Peng], Cai, D.Q.[Dong-Qi], Chen, Y.R.[Yu-Rong], Li, J.G.[Jian-Guo],
Learning Visual Knowledge Memory Networks for Visual Question Answering,
CVPR18(7736-7745)
IEEE DOI 1812
Visualization, Knowledge based systems, Task analysis, Knowledge discovery, Cognition, Ovens BibRef

Das, A., Datta, S., Gkioxari, G., Lee, S., Parikh, D., Batra, D.,
Embodied Question Answering,
DeepLearnRV18(2135-213509)
IEEE DOI 1812
Navigation, Visualization, Task analysis, Automobiles, Knowledge discovery BibRef

Cheng, W., Huang, Y., Wang, L.,
Towards Unconstrained Pointing Problem of Visual Question Answering: A Retrieval-based Method,
ICPR18(3303-3308)
IEEE DOI 1812
Visualization, Task analysis, Feature extraction, Training, Knowledge discovery, Proposals, Semantics BibRef

Zhou, B.[Bolei], Sun, Y.[Yiyou], Bau, D.[David], Torralba, A.B.[Antonio B.],
Interpretable Basis Decomposition for Visual Explanation,
ECCV18(VIII:122-138).
Springer DOI 1810
BibRef

Shi, Y.[Yang], Furlanello, T.[Tommaso], Zha, S.[Sheng], Anandkumar, A.[Animashree],
Question Type Guided Attention in Visual Question Answering,
ECCV18(II:158-175).
Springer DOI 1810
BibRef

Narasimhan, M.[Medhini], Schwing, A.G.[Alexander G.],
Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering,
ECCV18(VIII:460-477).
Springer DOI 1810
BibRef

Malinowski, M.[Mateusz], Doersch, C.[Carl], Santoro, A.[Adam], Battaglia, P.[Peter],
Learning Visual Question Answering by Bootstrapping Hard Attention,
ECCV18(VI:3-20).
Springer DOI 1810
BibRef

Gu, J.X.[Jiu-Xiang], Cai, J.F.[Jian-Fei], Joty, S.[Shafiq], Niu, L.[Li], Wang, G.[Gang],
Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models,
CVPR18(7181-7189)
IEEE DOI 1812
Visualization, Training, Decoding, Semantics, Measurement BibRef

Li, Q.[Qing], Tao, Q.Y.[Qing-Yi], Joty, S.[Shafiq], Cai, J.F.[Jian-Fei], Luo, J.B.[Jie-Bo],
VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions,
ECCV18(VII:570-586).
Springer DOI 1810
BibRef

Yu, D., Gao, X., Xiong, H.,
Structured Semantic Representation for Visual Question Answering,
ICIP18(2286-2290)
IEEE DOI 1809
Semantics, Training, Cognition, Visualization, Task analysis, Linguistics, Computational modeling, Visual question answering BibRef

Huang, L., Kulkarni, K., Jha, A., Lohit, S., Jayasuriya, S., Turaga, P.K.,
CS-VQA: Visual Question Answering with Compressively Sensed Images,
ICIP18(1283-1287)
IEEE DOI 1809
Visualization, Image reconstruction, Image coding, Task analysis, Feature extraction, Training, Multiplexing, image reconstruction BibRef

Desta, M.T., Chen, L., Kornuta, T.,
Object-Based Reasoning in VQA,
WACV18(1814-1823)
IEEE DOI 1806
data visualisation, inference mechanisms, natural language processing, object detection, Visualization BibRef

Zhao, H., Fan, Q., Gutfreund, D., Fu, Y.,
Semantically Guided Visual Question Answering,
WACV18(1852-1860)
IEEE DOI 1806
data visualisation, image colour analysis, image representation, learning (artificial intelligence), Visualization BibRef

Wang, Z., Liu, X., Wang, L., Qiao, Y., Xie, X., Fowlkes, C.C.[Charless C.],
Structured Triplet Learning with POS-Tag Guided Attention for Visual Question Answering,
WACV18(1888-1896)
IEEE DOI 1806
convolution, data visualisation, learning (artificial intelligence), Visualization BibRef

Chowdhury, I., Nguyen, K., Fookes, C., Sridharan, S.,
A cascaded long short-term memory (LSTM) driven generic visual question answering (VQA),
ICIP17(1842-1846)
IEEE DOI 1803
Feature extraction, Mathematical model, Natural languages, Principal component analysis, Task analysis, Training, scene understanding BibRef

Sheng, S.[Shurong], Venkitasubramanian, A.N.[Aparna Nurani], Moens, M.F.[Marie-Francine],
A Markov Network Based Passage Retrieval Method for Multimodal Question Answering in the Cultural Heritage Domain,
MMMod18(I:3-15).
Springer DOI 1802
BibRef

Yu, Z., Yu, J., Fan, J., Tao, D.,
Multi-modal Factorized Bilinear Pooling with Co-attention Learning for Visual Question Answering,
ICCV17(1839-1848)
IEEE DOI 1802
computational complexity, feature extraction, image fusion, learning (artificial intelligence), Visualization BibRef

Ben-younes, H., Cadene, R., Cord, M., Thome, N.,
MUTAN: Multimodal Tucker Fusion for Visual Question Answering,
ICCV17(2631-2639)
IEEE DOI 1802
image fusion, image representation, question answering (information retrieval), tensors, VQA tasks, Visualization BibRef

Jain, U.[Unnat], Zhang, Z.Y.[Zi-Yu], Schwing, A.[Alexander],
Creativity: Generating Diverse Questions Using Variational Autoencoders,
CVPR17(5415-5424)
IEEE DOI 1711
Artificial intelligence, Creativity, Hidden Markov models, Training, Transforms, Visualization BibRef

Zhu, Y., Lim, J.J., Fei-Fei, L.[Li],
Knowledge Acquisition for Visual Question Answering via Iterative Querying,
CVPR17(6146-6155)
IEEE DOI 1711
Computational modeling, Data models, Generators, Knowledge discovery, Standards, Visualization BibRef

Lin, Y.T.[Yue-Tan], Pang, Z.Y.[Zhang-Yang], Li, Y.[Yanan], Wang, D.H.[Dong-Hui],
Simple and effective visual question answering in a single modality,
ICIP16(2276-2280)
IEEE DOI 1610
Benchmark testing. Not just add text to image questions. BibRef

Kembhavi, A., Seo, M., Schwenk, D., Choi, J., Farhadi, A., Hajishirzi, H.,
Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension,
CVPR17(5376-5384)
IEEE DOI 1711
Cognition, Knowledge discovery, Natural languages, Training, Visualization BibRef

Ganju, S., Russakovsky, O., Gupta, A.,
What's in a Question: Using Visual Questions as a Form of Supervision,
CVPR17(6422-6431)
IEEE DOI 1711
Artificial intelligence, Computational modeling, Dogs, Image color analysis, SPICE, Visualization BibRef

Xu, H.J.[Hui-Juan], Saenko, K.[Kate],
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering,
ECCV16(VII:451-466).
Springer DOI 1611
Visual Question Answering. BibRef

Jabri, A.[Allan], Joulin, A.[Armand], van der Maaten, L.[Laurens],
Revisiting Visual Question Answering Baselines,
ECCV16(VIII:727-739).
Springer DOI 1611
BibRef

Yang, Z.C.[Zi-Chao], He, X.D.[Xiao-Dong], Gao, J.F.[Jian-Feng], Deng, L.[Li], Smola, A.[Alex],
Stacked Attention Networks for Image Question Answering,
CVPR16(21-29)
IEEE DOI 1612
BibRef

Sadeghi, F.[Fereshteh], Divvala, S.K.[Santosh K.], Farhadi, A.[Ali],
VisKE: Visual knowledge extraction and question answering by visual verification of relation phrases,
CVPR15(1456-1464)
IEEE DOI 1510
Visual verification of text relationships. BibRef

Liu, Y.[Yang], Liu, J.[Jie], Wang, D.[Dong], Cheng, J.[Jian],
A robust multivariate reranking algorithm for Question Answering enrichment,
ICIP12(1917-1920).
IEEE DOI 1302
BibRef

Varekamp, C.[Chris], van de Walle, P.[Patrick], de Putter, M.[Marc],
Question interface for 3D picture creation on an autostereoscopic digital picture frame,
3DTV09(1-4).
IEEE DOI 0905
BibRef


Last update: Sep 28, 2024 at 17:47:54