Yu, J.,
Li, J.,
Yu, Z.,
Huang, Q.,
Multimodal Transformer With Multi-View Visual Representation for
Image Captioning,
CirSysVideo(30), No. 12, December 2020, pp. 4467-4480.
IEEE DOI
2012
Visualization, Feature extraction, Hidden Markov models,
Adaptation models, Task analysis, Decoding, Computational modeling,
deep learning
BibRef
Zhang, Y.[Yu],
Shi, X.Y.[Xin-Yu],
Mi, S.[Siya],
Yang, X.[Xu],
Image captioning with transformer and knowledge graph,
PRL(143), 2021, pp. 43-49.
Elsevier DOI
2102
Image captioning, Transformer, Knowledge graph
BibRef
Yan, C.G.[Cheng-Gang],
Hao, Y.M.[Yi-Ming],
Li, L.[Liang],
Yin, J.[Jian],
Liu, A.[Anan],
Mao, Z.[Zhendong],
Chen, Z.Y.[Zhen-Yu],
Gao, X.Y.[Xing-Yu],
Task-Adaptive Attention for Image Captioning,
CirSysVideo(32), No. 1, January 2022, pp. 43-51.
IEEE DOI
2201
Task analysis, Visualization, Feature extraction, Decoding,
Computational modeling, Adaptation models, Feeds, Image captioning,
transformer
BibRef
Yuan, J.[Jin],
Zhu, S.[Shuai],
Huang, S.Y.[Shu-Yin],
Zhang, H.W.[Han-Wang],
Xiao, Y.Q.[Yao-Qiang],
Li, Z.Y.[Zhi-Yong],
Wang, M.[Meng],
Discriminative Style Learning for Cross-Domain Image Captioning,
IP(31), 2022, pp. 1723-1736.
IEEE DOI
2202
Decoding, Visualization, Syntactics, Semantics, Training, Logic gates,
Birds, Cross-domain, image captioning, style, instruction-based LSTM
BibRef
Zhou, Y.[Yuanen],
Zhang, Y.[Yong],
Hu, Z.Z.[Zhen-Zhen],
Wang, M.[Meng],
Semi-Autoregressive Transformer for Image Captioning,
CLVL21(3132-3136)
IEEE DOI
2112
Training, Degradation, Codes, Benchmark testing, Transformers
BibRef
Ren, Z.H.[Zi-Hao],
Gou, S.P.[Shui-Ping],
Guo, Z.[Zhang],
Mao, S.S.[Sha-Sha],
Li, R.M.[Rui-Min],
A Mask-Guided Transformer Network with Topic Token for Remote Sensing
Image Captioning,
RS(14), No. 12, 2022, pp. xx-yy.
DOI Link
2206
BibRef
Ji, J.Y.[Jia-Yi],
Ma, Y.[Yiwei],
Sun, X.S.[Xiao-Shuai],
Zhou, Y.[Yiyi],
Wu, Y.J.[Yong-Jian],
Ji, R.R.[Rong-Rong],
Knowing What to Learn: A Metric-Oriented Focal Mechanism for Image
Captioning,
IP(31), 2022, pp. 4321-4335.
IEEE DOI
2207
Integrated circuit modeling, Visualization, Training,
Task analysis, Measurement, Transformers, Computational modeling,
Effective CIDEr
BibRef
Li, X.[Xuan],
Zhang, W.K.[Wen-Kai],
Sun, X.[Xian],
Gao, X.[Xin],
Semantic-meshed and content-guided transformer for image captioning,
IET-CV(16), No. 5, 2022, pp. 431-444.
DOI Link
2207
computer vision, image annotation, natural language processing
BibRef
Xian, T.T.[Tian-Tao],
Li, Z.X.[Zhi-Xin],
Tang, Z.J.[Zhen-Jun],
Ma, H.F.[Hui-Fang],
Adaptive Path Selection for Dynamic Image Captioning,
CirSysVideo(32), No. 9, September 2022, pp. 5762-5775.
IEEE DOI
2209
Visualization, Feature extraction, Transformers, Semantics,
Computational modeling, Adaptation models, Computer architecture,
dynamic routing mechanism
BibRef
Yuan, Z.H.[Zhi-Hao],
Yan, X.[Xu],
Liao, Y.H.[Ying-Hong],
Guo, Y.[Yao],
Li, G.B.[Guan-Bin],
Cui, S.G.[Shu-Guang],
Li, Z.[Zhen],
X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D
Dense Captioning,
CVPR22(8553-8563)
IEEE DOI
2210
Point cloud compression, Training, Visualization,
Natural languages, Network architecture, Transformers, Visual reasoning
BibRef
Liu, B.[Bing],
Wang, D.[Dong],
Yang, X.[Xu],
Zhou, Y.[Yong],
Yao, R.[Rui],
Shao, Z.W.[Zhi-Wen],
Zhao, J.Q.[Jia-Qi],
Show, Deconfound and Tell: Image Captioning with Causal Inference,
CVPR22(18020-18029)
IEEE DOI
2210
Training, Visualization, Correlation, Object detection, Linguistics,
Transformers, Encoding, Vision + language, Computer vision theory,
Visual reasoning
BibRef
Fang, Z.Y.[Zhi-Yuan],
Wang, J.F.[Jian-Feng],
Hu, X.W.[Xiao-Wei],
Liang, L.[Lin],
Gan, Z.[Zhe],
Wang, L.J.[Li-Juan],
Yang, Y.Z.[Ye-Zhou],
Liu, Z.C.[Zi-Cheng],
Injecting Semantic Concepts into End-to-End Image Captioning,
CVPR22(17988-17998)
IEEE DOI
2210
Training, Computational modeling, Semantics, Computer architecture,
Feature extraction, Transformers, Market research,
Vision applications and systems
BibRef
Li, Y.[Yehao],
Pan, Y.[Yingwei],
Yao, T.[Ting],
Mei, T.[Tao],
Comprehending and Ordering Semantics for Image Captioning,
CVPR22(17969-17978)
IEEE DOI
2210
Visualization, Codes, Semantics, Computer architecture, Linguistics,
Transformers, Vision + language
BibRef
Hu, X.W.[Xiao-Wei],
Gan, Z.[Zhe],
Wang, J.F.[Jian-Feng],
Yang, Z.Y.[Zheng-Yuan],
Liu, Z.C.[Zi-Cheng],
Lu, Y.[Yumao],
Wang, L.J.[Li-Juan],
Scaling Up Vision-Language Pretraining for Image Captioning,
CVPR22(17959-17968)
IEEE DOI
2210
Training, Visualization, Computational modeling, Training data,
Benchmark testing, Transformers, Feature extraction, Vision + language
BibRef
Fei, Z.C.[Zheng-Cong],
Yan, X.[Xu],
Wang, S.H.[Shu-Hui],
Tian, Q.[Qi],
DeeCap: Dynamic Early Exiting for Efficient Image Captioning,
CVPR22(12206-12216)
IEEE DOI
2210
Learning systems, Computational modeling, Semantics, Merging,
Predictive models, Transformers, Decoding,
Vision + language
BibRef
Wu, M.R.[Ming-Rui],
Zhang, X.Y.[Xu-Ying],
Sun, X.S.[Xiao-Shuai],
Zhou, Y.[Yiyi],
Chen, C.[Chao],
Gu, J.X.[Jia-Xin],
Sun, X.[Xing],
Ji, R.R.[Rong-Rong],
DIFNet: Boosting Visual Information Flow for Image Captioning,
CVPR22(17999-18008)
IEEE DOI
2210
Integrated circuits, Visualization, Image segmentation,
Feature extraction, Boosting, Transformers, Decoding, Vision + language
BibRef
Rio-Torto, I.[Isabel],
Cardoso, J.S.[Jaime S.],
Teixeira, L.F.[Luís F.],
From Captions to Explanations: A Multimodal Transformer-based
Architecture for Natural Language Explanation Generation,
IbPRIA22(54-65).
Springer DOI
2205
BibRef
Chen, H.S.[Hai-Shun],
Wang, Y.[Ying],
Yang, X.[Xin],
Li, J.[Jie],
Captioning Transformer With Scene Graph Guiding,
ICIP21(2538-2542)
IEEE DOI
2201
Measurement, Visualization, Image processing, Semantics,
Neural networks, Decoding, Image captioning, Scene graph, Attention,
Deep Neural Network
BibRef
Zhang, P.C.[Peng-Chuan],
Li, X.J.[Xiu-Jun],
Hu, X.W.[Xiao-Wei],
Yang, J.W.[Jian-Wei],
Zhang, L.[Lei],
Wang, L.J.[Li-Juan],
Choi, Y.J.[Ye-Jin],
Gao, J.F.[Jian-Feng],
VinVL: Revisiting Visual Representations in Vision-Language Models,
CVPR21(5575-5584)
IEEE DOI
2111
Training, Visualization, Computational modeling, Object detection,
Benchmark testing, Feature extraction, Transformers
BibRef
Zhang, X.Y.[Xu-Ying],
Sun, X.S.[Xiao-Shuai],
Luo, Y.P.[Yun-Peng],
Ji, J.Y.[Jia-Yi],
Zhou, Y.[Yiyi],
Wu, Y.J.[Yong-Jian],
Huang, F.Y.[Fei-Yue],
Ji, R.R.[Rong-Rong],
RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual
Words,
CVPR21(15460-15469)
IEEE DOI
2111
Geometry, Visualization, Adaptation models, Predictive models,
Transformers, Time measurement, Servers
BibRef
He, S.[Sen],
Liao, W.T.[Wen-Tong],
Tavakoli, H.R.[Hamed R.],
Yang, M.[Michael],
Rosenhahn, B.[Bodo],
Pugeault, N.[Nicolas],
Image Captioning Through Image Transformer,
ACCV20(IV:153-169).
Springer DOI
2103
BibRef
Cornia, M.,
Stefanini, M.,
Baraldi, L.,
Cucchiara, R.,
Meshed-Memory Transformer for Image Captioning,
CVPR20(10575-10584)
IEEE DOI
2008
Decoding, Encoding, Visualization, Image coding,
Computer architecture, Proposals, Task analysis
BibRef
Li, G.,
Zhu, L.,
Liu, P.,
Yang, Y.,
Entangled Transformer for Image Captioning,
ICCV19(8927-8936)
IEEE DOI
2004
image retrieval, learning (artificial intelligence),
natural language processing, recurrent neural nets, robot vision, Proposals
BibRef