Wu, Y.C.[Yu-Chieh],
Yang, J.C.[Jie-Chi],
A Robust Passage Retrieval Algorithm for Video Question Answering,
CirSysVideo(18), No. 10, October 2008, pp. 1411-1421.
IEEE DOI
0811
BibRef
Wu, Y.C.[Yu-Chieh],
Lee, Y.S.[Yue-Shi],
Yang, J.C.[Jie-Chi],
Yen, S.J.[Show-Jane],
A New Passage Ranking Algorithm for Video Question Answering,
PSIVT06(563-572).
Springer DOI
0612
BibRef
Li, G.D.[Guang-Da],
Li, H.J.[Hao-Jie],
Ming, Z.Y.[Zhao-Yan],
Hong, R.C.[Ri-Chang],
Tang, S.[Sheng],
Chua, T.S.[Tat-Seng],
Question Answering over Community-Contributed Web Videos,
MultMedMag(17), No. 4, October-December 2010, pp. 46-57.
IEEE DOI
1011
BibRef
Song, Y.C.[Yi-Cheng],
Li, H.J.[Hao-Jie],
Mash-Up Approach for Web Video Category Recommendation,
PSIVT10(197-202).
IEEE DOI
1011
BibRef
Guo, Z.Y.[Zhao-Yu],
Zhao, Z.[Zhou],
Jin, W.[Weike],
Wei, Z.C.[Zhi-Cheng],
Yang, M.[Min],
Wang, N.N.[Nan-Nan],
Yuan, N.J.[Nicholas Jing],
Multi-Turn Video Question Generation via Reinforced Multi-Choice
Attention Network,
CirSysVideo(31), No. 5, 2021, pp. 1697-1710.
IEEE DOI
2105
BibRef
Xue, H.Y.[Hong-Yang],
Chu, W.,
Zhao, Z.[Zhou],
Cai, D.[Deng],
A Better Way to Attend: Attention With Trees for Video Question
Answering,
IP(27), No. 11, November 2018, pp. 5563-5574.
IEEE DOI
1809
computational linguistics, feature extraction, grammars,
natural language processing, scene understanding
BibRef
Xue, H.Y.[Hong-Yang],
Zhao, Z.[Zhou],
Cai, D.[Deng],
Unifying the Video and Question Attentions for Open-Ended Video
Question Answering,
IP(26), No. 12, December 2017, pp. 5656-5666.
IEEE DOI
1710
image retrieval, video coding,
temporal question attention, temporal structures,
Adaptation models, Coherence, Hair, Knowledge discovery
BibRef
Zhao, Z.[Zhou],
Xiao, S.W.[Shu-Wen],
Song, Z.[Zehan],
Lu, C.J.[Chu-Jie],
Xiao, J.[Jun],
Zhuang, Y.T.[Yue-Ting],
Open-Ended Video Question Answering via Multi-Modal Conditional
Adversarial Networks,
IP(29), 2020, pp. 3859-3870.
IEEE DOI
2002
Open-ended video question answering, multi-modal neural network
BibRef
Zhao, Z.[Zhou],
Zhang, Z.[Zhu],
Xiao, S.W.[Shu-Wen],
Xiao, Z.X.[Zhen-Xin],
Yan, X.H.[Xiao-Hui],
Yu, J.[Jun],
Cai, D.[Deng],
Wu, F.[Fei],
Long-Form Video Question Answering via Dynamic Hierarchical
Reinforced Networks,
IP(28), No. 12, December 2019, pp. 5939-5952.
IEEE DOI
1909
Knowledge discovery, Semantics, Visualization, Natural languages,
Road transportation, Task analysis, Decoding,
reinforcement learning
BibRef
Yu, T.[Ting],
Yu, J.[Jun],
Yu, Z.[Zhou],
Huang, Q.M.[Qing-Ming],
Tian, Q.[Qi],
Long-Term Video Question Answering via Multimodal Hierarchical Memory
Attentive Networks,
CirSysVideo(31), No. 3, March 2021, pp. 931-944.
IEEE DOI
2103
Knowledge discovery, Cognition, Visualization, Task analysis,
Semantics, Engines, Computational modeling, Long-term,
in-depth reasoning
BibRef
Jang, Y.[Yunseok],
Song, Y.[Yale],
Kim, C.D.[Chris Dongjoo],
Yu, Y.[Youngjae],
Kim, Y.[Youngjin],
Kim, G.[Gunhee],
Video Question Answering with Spatio-Temporal Reasoning,
IJCV(127), No. 10, October 2019, pp. 1385-1412.
Springer DOI
1909
BibRef
Earlier: A1, A2, A4, A5, A6, Only:
TGIF-QA:
Toward Spatio-Temporal Reasoning in Visual Question Answering,
CVPR17(1359-1367)
IEEE DOI
1711
Cognition, Crowdsourcing, Image color analysis,
Knowledge discovery, Motion pictures, Visualization
BibRef
Yu, T.,
Yu, J.,
Yu, Z.,
Tao, D.,
Compositional Attention Networks With Two-Stream Fusion for Video
Question Answering,
IP(29), 2020, pp. 1204-1218.
IEEE DOI
1911
Visualization, Streaming media, Knowledge discovery,
Feature extraction, Proposals, Task analysis, Semantics,
action pooling stream
BibRef
Wang, W.N.[Wei-Ning],
Huang, Y.[Yan],
Wang, L.[Liang],
Long video question answering: A Matching-guided Attention Model,
PR(102), 2020, pp. 107248.
Elsevier DOI
2003
Long video QA, Matching-guided attention
BibRef
Zhang, W.,
Tang, S.,
Cao, Y.,
Pu, S.,
Wu, F.,
Zhuang, Y.,
Frame Augmented Alternating Attention Network for Video Question
Answering,
MultMed(22), No. 4, April 2020, pp. 1032-1041.
IEEE DOI
2004
Feature extraction, Visualization, Knowledge discovery,
Task analysis, Data mining, Neural networks, Semantics, Video QA,
neural network
BibRef
Chen, J.[Jie],
Shao, J.[Jie],
He, C.[Chengkun],
Movie fill in the blank by joint learning from video and text with
adaptive temporal attention,
PRL(132), 2020, pp. 62-68.
Elsevier DOI
2005
Video question answering, Adaptive temporal attention, Text information fusion
BibRef
Wang, A.,
Luu, A.T.,
Foo, C.,
Zhu, H.,
Tay, Y.,
Chandrasekhar, V.,
Holistic Multi-Modal Memory Network for Movie Question Answering,
IP(29), No. 1, 2020, pp. 489-499.
IEEE DOI
1910
question answering (information retrieval),
holistic multimodal memory network, multimodal context,
MovieQA
BibRef
Yuan, Z.Q.[Zhao-Quan],
Sun, S.Y.[Si-Yuan],
Duan, L.X.[Li-Xin],
Li, C.S.[Chang-Sheng],
Wu, X.[Xiao],
Xu, C.S.[Chang-Sheng],
Adversarial Multimodal Network for Movie Story Question Answering,
MultMed(23), 2021, pp. 1744-1756.
IEEE DOI
2106
Knowledge discovery, Motion pictures, Visualization, Task analysis,
Generators, Natural languages,
multimodal understanding
BibRef
Gu, M.,
Zhao, Z.,
Jin, W.,
Hong, R.,
Wu, F.,
Graph-Based Multi-Interaction Network for Video Question Answering,
IP(30), 2021, pp. 2758-2770.
IEEE DOI
2102
Visualization, Knowledge discovery, Cats, Semantics, Task analysis,
Image segmentation, Adaptation models, Video question answering,
graph-based relation-aware neural network
BibRef
Xie, Z.[Zhao],
Wu, K.W.[Ke-Wei],
Zhang, X.Y.[Xiao-Yu],
Yang, X.M.[Xing-Ming],
Hou, J.K.[Jin-Kui],
Learning continuous temporal embedding of videos using pattern theory,
PRL(146), 2021, pp. 222-229.
Elsevier DOI
2105
Action Recognition, Continuous Temporal Embedding, Pattern Theory, CNN, LSTM
BibRef
Liu, Y.[Yun],
Zhang, X.M.[Xiao-Ming],
Zhang, Q.Y.[Qian-Yun],
Li, C.Z.[Chao-Zhuo],
Huang, F.[Feiran],
Tang, X.H.[Xiang-Hong],
Li, Z.J.[Zhou-Jun],
Dual self-attention with co-attention networks for visual question
answering,
PR(117), 2021, pp. 107956.
Elsevier DOI
2106
Self-attention, Visual-textual co-attention, Visual question answering
BibRef
Liu, Y.[Yun],
Zhang, X.M.[Xiao-Ming],
Huang, F.[Feiran],
Shen, S.X.[Shi-Xun],
Tian, P.[Peng],
Li, L.[Lang],
Li, Z.J.[Zhou-Jun],
Dynamic Self-Attention with Vision Synchronization Networks for Video
Question Answering,
PR(132), 2022, pp. 108959.
Elsevier DOI
2209
Video question answering, Dynamic self-attention, Vision synchronization
BibRef
Liu, Y.[Yun],
Zhang, X.M.[Xiao-Ming],
Huang, F.[Feiran],
Zhang, B.[Bo],
Li, Z.J.[Zhou-Jun],
Cross-Attentional Spatio-Temporal Semantic Graph Networks for Video
Question Answering,
IP(31), 2022, pp. 1684-1696.
IEEE DOI
2202
Semantics, Correlation, Cognition, Visualization,
Knowledge discovery, Task analysis, Head, Video question answering,
inter- and intra-modality correlations
BibRef
Jin, W.[Weike],
Zhao, Z.[Zhou],
Cao, X.C.[Xiao-Chun],
Zhu, J.M.[Jie-Ming],
He, X.Q.[Xiu-Qiang],
Zhuang, Y.T.[Yue-Ting],
Adaptive Spatio-Temporal Graph Enhanced Vision-Language
Representation for Video QA,
IP(30), 2021, pp. 5477-5489.
IEEE DOI
2106
Visualization, Task analysis, Adaptation models, Bit error rate,
Knowledge discovery, Cognition, Training,
video question answering
BibRef
Gao, L.[Lianli],
Chen, T.M.[Tang-Ming],
Li, X.P.[Xiang-Peng],
Zeng, P.P.[Peng-Peng],
Zhao, L.[Lei],
Li, Y.F.[Yuan-Fang],
Generalized pyramid co-attention with learnable aggregation net for
video question answering,
PR(120), 2021, pp. 108145.
Elsevier DOI
2109
Video question answering, Diversity learning,
Learnable aggregation, Cascaded pyramid transformer co-attention
BibRef
Le, T.M.[Thao Minh],
Le, V.[Vuong],
Venkatesh, S.[Svetha],
Tran, T.[Truyen],
Hierarchical Conditional Relation Networks for Multimodal Video
Question Answering,
IJCV(129), No. 11, November 2021, pp. 3027-3050.
Springer DOI
2110
BibRef
Earlier:
Hierarchical Conditional Relation Networks for Video Question
Answering,
CVPR20(9969-9978)
IEEE DOI
2008
Linguistics, Cognition, Visualization,
Context modeling, Encoding, Buildings
BibRef
Su, H.T.[Hung-Ting],
Chang, C.H.[Chen-Hsi],
Shen, P.W.[Po-Wei],
Wang, Y.S.[Yu-Siang],
Chang, Y.L.[Ya-Liang],
Chang, Y.C.[Yu-Cheng],
Cheng, P.J.[Pu-Jen],
Hsu, W.H.[Winston H.],
End-to-End Video Question-Answer Generation With Generator-Pretester
Network,
CirSysVideo(31), No. 11, November 2021, pp. 4497-4507.
IEEE DOI
2112
Training, Task analysis, Knowledge discovery, Proposals,
Streaming media, Generators, Data models, Video question answering,
pretester network
BibRef
Gao, L.L.[Lian-Li],
Lei, Y.[Yu],
Zeng, P.P.[Peng-Peng],
Song, J.K.[Jing-Kuan],
Wang, M.[Meng],
Shen, H.T.[Heng Tao],
Hierarchical Representation Network With Auxiliary Tasks for Video
Captioning and Video Question Answering,
IP(31), 2022, pp. 202-215.
IEEE DOI
2112
Task analysis, Visualization, Semantics, Knowledge discovery,
Artificial neural networks, Syntactics, Decoding, Video captioning,
auxiliary task
BibRef
Zhang, J.P.[Ji-Peng],
Shao, J.[Jie],
Cao, R.[Rui],
Gao, L.L.[Lian-Li],
Xu, X.[Xing],
Shen, H.T.[Heng Tao],
Action-Centric Relation Transformer Network for Video Question
Answering,
CirSysVideo(32), No. 1, January 2022, pp. 63-74.
IEEE DOI
2201
Feature extraction, Visualization, Cognition, Task analysis,
Knowledge discovery, Proposals, Encoding, Video question answering,
relation reasoning
BibRef
Zhang, H.[Hao],
Sun, A.[Aixin],
Jing, W.[Wei],
Zhen, L.L.[Liang-Li],
Zhou, J.T.Y.[Joey Tian-Yi],
Goh, R.S.M.[Rick Siow Mong],
Natural Language Video Localization: A Revisit in Span-Based Question
Answering Framework,
PAMI(44), No. 8, August 2022, pp. 4252-4266.
IEEE DOI
2207
Location awareness, Knowledge discovery, Task analysis, Standards,
Feature extraction, Degradation, Semantics,
cross-modal interaction
BibRef
Wang, J.Y.[Jian-Yu],
Bao, B.K.[Bing-Kun],
Xu, C.S.[Chang-Sheng],
DualVGR: A Dual-Visual Graph Reasoning Unit for Video Question
Answering,
MultMed(24), 2022, pp. 3369-3380.
IEEE DOI
2207
Cognition, Visualization, Task analysis, Knowledge discovery,
Feature extraction, Fuses, Dogs, Video question answering, multi-modal
BibRef
Zeng, P.P.[Peng-Peng],
Zhang, H.N.[Hao-Nan],
Gao, L.[Lianli],
Song, J.K.[Jing-Kuan],
Shen, H.T.[Heng Tao],
Video Question Answering With Prior Knowledge and Object-Sensitive
Learning,
IP(31), 2022, pp. 5936-5948.
IEEE DOI
2209
Cognition, Visualization, Task analysis,
Question answering (information retrieval), Semantics, Ice, object learning
BibRef
Gan, Z.[Zhe],
Li, L.J.[Lin-Jie],
Li, C.Y.[Chun-Yuan],
Wang, L.J.[Li-Juan],
Liu, Z.C.[Zi-Cheng],
Gao, J.F.[Jian-Feng],
Vision-Language Pre-Training:
Basics, Recent Advances, and Future Trends,
FTCGV(14), No. 3-4, 2022, pp. 163-352.
DOI Link
2200
Video analysis and event recognition, Learning and statistical methods,
Object and scene recognition, Image and video retrieval
BibRef
Zhang, F.[Fuwei],
Wang, R.[Ruomei],
Zhou, F.[Fan],
Luo, Y.M.[Yuan-Mao],
ERM: Energy-Based Refined-Attention Mechanism for Video Question
Answering,
CirSysVideo(33), No. 3, March 2023, pp. 1454-1467.
IEEE DOI
2303
Spatiotemporal phenomena, Visualization,
Object oriented modeling, Transformers, Task analysis, Neurons,
pseudo-related information
BibRef
Yang, J.[Jonghyeon],
Jang, H.[Hanme],
Yu, K.[Kiyun],
Analyzing Geographic Questions Using Embedding-based Topic Modeling,
IJGI(12), No. 2, 2023, pp. xx-yy.
DOI Link
2303
BibRef
Zhao, S.[Shengwei],
Liu, Y.Y.[Yu-Ying],
Du, S.[Shaoyi],
Tian, Z.Q.[Zhi-Qiang],
Qu, T.[Ting],
Xu, L.H.[Lin-Hai],
CMFG: Cross-model Fine-grained Feature Interaction for Text-video
Retrieval,
MMMod23(II: 435-445).
Springer DOI
2304
BibRef
Luo, H.N.[Hao-Nan],
Lin, G.S.[Guo-Sheng],
Yao, Y.Z.[Ya-Zhou],
Liu, F.Y.[Fa-Yao],
Liu, Z.C.[Zi-Chuan],
Tang, Z.M.[Zhen-Min],
Depth and Video Segmentation Based Visual Attention for Embodied
Question Answering,
PAMI(45), No. 6, June 2023, pp. 6807-6819.
IEEE DOI
2305
BibRef
Earlier: A1, A2, A5, A4, A6, A3:
SegEQA: Video Segmentation Based Visual Attention for Embodied
Question Answering,
ICCV19(9666-9675)
IEEE DOI
2004
Visualization, Semantics, Navigation, Task analysis,
Image segmentation, Knowledge discovery, Feature extraction, navigation,
feature extraction, image fusion, question answering (information retrieval),
feature fusion
BibRef
Zhang, M.[Mingda],
Hwa, R.[Rebecca],
Kovashka, A.[Adriana],
How to Practice VQA on a Resource-limited Target Domain,
WACV23(4440-4449)
IEEE DOI
2302
Visualization, Adaptation models, Sensitivity,
Computational modeling, Transfer learning, visual reasoning
BibRef
Lee, J.[Jihyeon],
Kang, W.[Wooyoung],
Kim, E.S.[Eun-Sol],
Dense but Efficient VideoQA for Intricate Compositional Reasoning,
WACV23(1114-1123)
IEEE DOI
2302
Representation learning, Deformable models, Visualization,
Computational modeling, Semantics, Transformers,
Vision + language and/or other modalities
BibRef
Shen, R.[Ruoyue],
Inoue, N.[Nakamasa],
Shinoda, K.[Koichi],
Text-Guided Object Detector for Multi-modal Video Question Answering,
WACV23(1032-1042)
IEEE DOI
2302
Training, Measurement, Visualization, Annotations, Semantics,
Detectors, Object detection,
Vision + language and/or other modalities
BibRef
Fang, S.[Sheng],
Wang, S.H.[Shu-Hui],
Zhuo, J.[Junbao],
Han, X.Z.[Xin-Zhe],
Huang, Q.M.[Qing-Ming],
Learning Linguistic Association Towards Efficient Text-Video Retrieval,
ECCV22(XXXVI:254-270).
Springer DOI
2211
BibRef
Xiao, J.B.[Jun-Bin],
Zhou, P.[Pan],
Chua, T.S.[Tat-Seng],
Yan, S.C.[Shui-Cheng],
Video Graph Transformer for Video Question Answering,
ECCV22(XXXVI:39-58).
Springer DOI
2211
BibRef
Piergiovanni, A.J.,
Morton, K.[Kairo],
Kuo, W.C.[Wei-Cheng],
Ryoo, M.S.[Michael S.],
Angelova, A.[Anelia],
Video Question Answering with Iterative Video-Text Co-tokenization,
ECCV22(XXXVI:76-94).
Springer DOI
2211
BibRef
Bärmann, L.[Leonard],
Waibel, A.[Alex],
Where did I leave my keys?: Episodic-Memory-Based Question Answering
on Egocentric Videos,
Ego4D-EPIC22(1559-1567)
IEEE DOI
2210
Limiting, Codes, Computational modeling,
Memory management, Question answering (information retrieval)
BibRef
Li, J.T.[Jiang-Tong],
Niu, L.[Li],
Zhang, L.Q.[Li-Qing],
From Representation to Reasoning: Towards both Evidence and
Commonsense Reasoning for Video Question-Answering,
CVPR22(21241-21250)
IEEE DOI
2210
Representation learning, Visualization, Grounding,
Benchmark testing, Distance measurement, Pattern recognition,
Visual reasoning
BibRef
Datta, S.[Samyak],
Dharur, S.[Sameer],
Cartillier, V.[Vincent],
Desai, R.[Ruta],
Khanna, M.[Mukul],
Batra, D.[Dhruv],
Parikh, D.[Devi],
Episodic Memory Question Answering,
CVPR22(19097-19106)
IEEE DOI
2210
Visualization, Semantics, Memory management, Video sequences,
Question answering (information retrieval), Robustness,
Vision + language
BibRef
Gandhi, M.[Mona],
Gul, M.O.[Mustafa Omer],
Prakash, E.[Eva],
Grunde-McLaughlin, M.[Madeleine],
Krishna, R.[Ranjay],
Agrawala, M.[Maneesh],
Measuring Compositional Consistency for Video Question Answering,
CVPR22(5036-5045)
IEEE DOI
2210
Measurement, Visualization, Directed acyclic graph, Image analysis,
Benchmark testing, Cognition, Vision + language,
Visual reasoning
BibRef
Gorti, S.K.[Satya Krishna],
Vouitsis, N.[Noël],
Ma, J.W.[Jun-Wei],
Golestan, K.[Keyvan],
Volkovs, M.[Maksims],
Garg, A.[Animesh],
Yu, G.[Guangwei],
X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval,
CVPR22(4996-5005)
IEEE DOI
2210
Visualization, Codes, Computational modeling, Benchmark testing,
Question answering (information retrieval), Cognition,
Video analysis and understanding
BibRef
Li, J.C.[Jun-Cheng],
Tang, S.L.[Si-Liang],
Zhu, L.C.[Lin-Chao],
Shi, H.[Haochen],
Huang, X.[Xuanwen],
Wu, F.[Fei],
Yang, Y.[Yi],
Zhuang, Y.T.[Yue-Ting],
Adaptive Hierarchical Graph Reasoning with Semantic Coherence for
Video-and-Language Inference,
ICCV21(1847-1857)
IEEE DOI
2203
Visualization, Adaptation models, Adaptive systems, Semantics,
Coherence, Linguistics, Vision + language,
Vision + other modalities
BibRef
Zhang, M.X.[Ming-Xing],
Yang, Y.[Yang],
Chen, X.[Xinghan],
Ji, Y.L.[Yan-Li],
Xu, X.[Xing],
Li, J.J.[Jing-Jing],
Shen, H.T.[Heng Tao],
Multi-stage Aggregated Transformer Network for Temporal Language
Localization in Videos,
CVPR21(12664-12673)
IEEE DOI
2111
Location awareness, Visualization,
Computational modeling, Scalability, Transformers
BibRef
Kim, N.[Nayoung],
Ha, S.J.[Seong Jong],
Kang, J.W.[Je-Won],
Video Question Answering Using Language-Guided Deep Compressed-Domain
Video Feature,
ICCV21(1688-1697)
IEEE DOI
2203
Deep learning, Training, Visualization, Computational modeling,
Neural networks, Video compression, Feature extraction,
Vision + other modalities
BibRef
Liu, F.[Fei],
Liu, J.[Jing],
Wang, W.N.[Wei-Ning],
Lu, H.Q.[Han-Qing],
HAIR: Hierarchical Visual-Semantic Relational Reasoning for Video
Question Answering,
ICCV21(1678-1687)
IEEE DOI
2203
Hair, Heart, Visualization, Semantics, Benchmark testing, Cognition,
Vision + language
BibRef
Yang, A.[Antoine],
Miech, A.[Antoine],
Sivic, J.[Josef],
Laptev, I.[Ivan],
Schmid, C.[Cordelia],
Just Ask:
Learning to Answer Questions from Millions of Narrated Videos,
ICCV21(1666-1677)
IEEE DOI
2203
Training, Visualization, Vocabulary, Annotations, Scalability, Manuals,
Transformers, Vision + language
BibRef
Gao, D.F.[Di-Fei],
Wang, R.P.[Rui-Ping],
Bai, Z.[Ziyi],
Chen, X.L.[Xi-Lin],
Env-QA: A Video Question Answering Benchmark for Comprehensive
Understanding of Dynamic Environments,
ICCV21(1655-1665)
IEEE DOI
2203
Visualization, Layout, Feature extraction, Transformers, Cognition,
Data mining, Vision + language, Video analysis and understanding,
Visual reasoning and logical representation
BibRef
Yun, H.[Heeseung],
Yu, Y.[Youngjae],
Yang, W.[Wonsuk],
Lee, K.[Kangil],
Kim, G.[Gunhee],
Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos,
ICCV21(2011-2021)
IEEE DOI
2203
Training, Navigation, Grounding, Semantics, Benchmark testing,
Transformers, Vision + language, Vision + other modalities
BibRef
Lei, J.[Jie],
Li, L.J.[Lin-Jie],
Zhou, L.[Luowei],
Gan, Z.[Zhe],
Berg, T.L.[Tamara L.],
Bansal, M.[Mohit],
Liu, J.J.[Jing-Jing],
Less is More:
CLIPBERT for Video-and-Language Learning via Sparse Sampling,
CVPR21(7327-7337)
IEEE DOI
2111
Training, Computational modeling, Feature extraction, Knowledge discovery,
Distance measurement
BibRef
Xu, L.[Li],
Huang, H.[He],
Liu, J.[Jun],
SUTD-TrafficQA: A Question Answering Benchmark and an Efficient
Network for Video Reasoning over Traffic Events,
CVPR21(9873-9883)
IEEE DOI
2111
Transportation, Benchmark testing, Knowledge discovery, Cognition,
Computational efficiency, Pattern recognition, Reliability
BibRef
Park, J.[Jungin],
Lee, J.Y.[Ji-Young],
Sohn, K.H.[Kwang-Hoon],
Bridge to Answer: Structure-aware Graph Interaction Network for Video
Question Answering,
CVPR21(15521-15530)
IEEE DOI
2111
Bridges, Visualization, Computational modeling, Message passing,
Semantics, Benchmark testing, Linguistics
BibRef
Chen, X.W.[Xuan-Wei],
Liu, R.[Rui],
Song, X.M.[Xiao-Meng],
Han, Y.H.[Ya-Hong],
Locating Visual Explanations for Video Question Answering,
MMMod21(I:290-302).
Springer DOI
2106
BibRef
Garcia, N.[Noa],
Nakashima, Y.[Yuta],
Knowledge-based Video Question Answering with Unsupervised Scene
Descriptions,
ECCV20(XVIII:581-598).
Springer DOI
2012
BibRef
Kim, J.,
Ma, M.,
Pham, T.,
Kim, K.,
Yoo, C.D.,
Modality Shifting Attention Network for Multi-Modal Video Question
Answering,
CVPR20(10103-10112)
IEEE DOI
2008
Cognition, Visualization, Task analysis, Knowledge discovery,
Proposals, Modulation, Context modeling
BibRef
Jiang, M.,
Chen, S.,
Yang, J.,
Zhao, Q.,
Fantastic Answers and Where to Find Them: Immersive Question-Directed
Visual Attention,
CVPR20(2977-2986)
IEEE DOI
2008
Task analysis, Videos, Visualization, Computational modeling, Head, Resists
BibRef
Yang, Z.,
Garcia, N.,
Chu, C.,
Otani, M.,
Nakashima, Y.,
Takemura, H.,
BERT Representations for Video Question Answering,
WACV20(1545-1554)
IEEE DOI
2006
Visualization, Bit error rate, Feature extraction,
Knowledge discovery, Task analysis, Semantics, Standards
BibRef
Fan, C.Y.[Chen-You],
Zhang, X.F.[Xiao-Fan],
Zhang, S.[Shu],
Wang, W.S.[Wen-Sheng],
Zhang, C.[Chi],
Huang, H.[Heng],
Heterogeneous Memory Enhanced Multimodal Attention Model for Video
Question Answering,
CVPR19(1999-2007).
IEEE DOI
2002
BibRef
Kim, J.Y.[Jun-Yeong],
Ma, M.[Minuk],
Kim, K.[Kyungsu],
Kim, S.[Sungjin],
Yoo, C.D.[Chang D.],
Progressive Attention Memory Network for Movie Story Question Answering,
CVPR19(8329-8338).
IEEE DOI
2002
BibRef
Liu, C.N.[Chao-Ning],
Chen, D.J.[Ding-Jie],
Chen, H.T.[Hwann-Tzong],
Liu, T.L.[Tyng-Luh],
A2A: Attention to Attention Reasoning for Movie Question Answering,
ACCV18(VI:404-419).
Springer DOI
1906
BibRef
Gao, J.,
Ge, R.,
Chen, K.,
Nevatia, R.,
Motion-Appearance Co-memory Networks for Video Question Answering,
CVPR18(6576-6585)
IEEE DOI
1812
Knowledge discovery, Cognition, Task analysis, Dynamics,
Memory modules, Micromechanical devices, Logic gates
BibRef
Kim, K.M.[Kyung-Min],
Choi, S.H.[Seong-Ho],
Kim, J.H.[Jin-Hwa],
Zhang, B.T.[Byoung-Tak],
Multimodal Dual Attention Memory for Video Story Question Answering,
ECCV18(XV: 698-713).
Springer DOI
1810
BibRef
Yu, Y.J.[Young-Jae],
Kim, J.S.[Jong-Seok],
Kim, G.[Gunhee],
A Joint Sequence Fusion Model for Video Question Answering and
Retrieval,
ECCV18(VII: 487-503).
Springer DOI
1810
BibRef
Hasan Chowdhury, M.I.,
Nguyen, K.,
Sridharan, S.,
Fookes, C.,
Hierarchical Relational Attention for Video Question Answering,
ICIP18(599-603)
IEEE DOI
1809
Feature extraction, Knowledge discovery, Visualization,
Task analysis, Mathematical model, Natural languages, scene understanding
BibRef
Mun, J.[Jonghwan],
Seo, P.H.[Paul Hongsuck],
Jung, I.[Ilchae],
Han, B.H.[Bo-Hyung],
MarioQA: Answering Questions by Watching Gameplay Videos,
ICCV17(2886-2894)
IEEE DOI
1802
computer games, inference mechanisms, neural nets,
question answering (information retrieval), VideoQA problems, Visualization
BibRef
Yu, Y.,
Ko, H.,
Choi, J.,
Kim, G.,
End-to-End Concept Word Detection for Video Captioning, Retrieval,
and Question Answering,
CVPR17(3261-3269)
IEEE DOI
1711
Detectors, Knowledge discovery, Motion pictures, Semantics, Training,
Visualization
BibRef