Feng, Y.S.[Yan-Song],
Lapata, M.,
Automatic Caption Generation for News Images,
PAMI(35), No. 4, April 2013, pp. 797-812.
IEEE DOI
1303
Use existing captions and tags, expand to similar images.
BibRef
Vinyals, O.[Oriol],
Toshev, A.[Alexander],
Bengio, S.[Samy],
Erhan, D.[Dumitru],
Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning
Challenge,
PAMI(39), No. 4, April 2017, pp. 652-663.
IEEE DOI
1703
BibRef
Earlier:
Show and tell: A neural image caption generator,
CVPR15(3156-3164)
IEEE DOI
1510
Computational modeling
BibRef
Wang, J.Y.[Jing-Ya],
Zhu, X.T.[Xia-Tian],
Gong, S.G.[Shao-Gang],
Discovering visual concept structure with sparse and incomplete tags,
AI(250), No. 1, 2017, pp. 16-36.
Elsevier DOI
1708
Automatically discovering the semantic structure of tagged visual data
(e.g. web videos and images).
BibRef
Kilickaya, M.[Mert],
Akkus, B.K.[Burak Kerim],
Cakici, R.[Ruket],
Erdem, A.[Aykut],
Erdem, E.[Erkut],
Ikizler-Cinbis, N.[Nazli],
Data-driven image captioning via salient region discovery,
IET-CV(11), No. 6, September 2017, pp. 398-406.
DOI Link
1709
BibRef
He, X.D.[Xiao-Dong],
Deng, L.[Li],
Deep Learning for Image-to-Text Generation: A Technical Overview,
SPMag(34), No. 6, November 2017, pp. 109-116.
IEEE DOI
1712
BibRef
And:
Errata:
SPMag(35), No. 1, January 2018, pp. 178.
IEEE DOI
Artificial intelligence, Image classification,
Natural language processing, Pediatrics, Semantics, Training data,
Visualization
BibRef
Li, L.H.[Ling-Hui],
Tang, S.[Sheng],
Zhang, Y.D.[Yong-Dong],
Deng, L.X.[Li-Xi],
Tian, Q.[Qi],
GLA: Global-Local Attention for Image Description,
MultMed(20), No. 3, March 2018, pp. 726-737.
IEEE DOI
1802
Computational modeling, Decoding, Feature extraction,
Image recognition, Natural language processing,
recurrent neural network
BibRef
Lu, X.,
Wang, B.,
Zheng, X.,
Li, X.,
Exploring Models and Data for Remote Sensing Image Caption Generation,
GeoRS(56), No. 4, April 2018, pp. 2183-2195.
IEEE DOI
1804
Feature extraction, Image representation,
Recurrent neural networks, Remote sensing, Semantics,
semantic understanding
BibRef
Wu, C.L.[Chun-Lei],
Wei, Y.W.[Yi-Wei],
Chu, X.L.[Xiao-Liang],
Su, F.[Fei],
Wang, L.Q.[Lei-Quan],
Modeling visual and word-conditional semantic attention for image
captioning,
SP:IC(67), 2018, pp. 100-107.
Elsevier DOI
1808
Image captioning, Word-conditional semantic attention,
Visual attention, Attention variation
BibRef
Zhang, M.,
Yang, Y.,
Zhang, H.,
Ji, Y.,
Shen, H.T.,
Chua, T.,
More is Better: Precise and Detailed Image Captioning Using Online
Positive Recall and Missing Concepts Mining,
IP(28), No. 1, January 2019, pp. 32-44.
IEEE DOI
1810
data mining, image representation, image retrieval,
image segmentation, learning (artificial intelligence),
element-wise selection
BibRef
Gella, S.[Spandana],
Keller, F.[Frank],
Lapata, M.[Mirella],
Disambiguating Visual Verbs,
PAMI(41), No. 2, February 2019, pp. 311-322.
IEEE DOI
1901
Given an image and a verb, assign the correct sense of the verb.
Visualization, Image recognition, Semantics,
Natural language processing, Horses, Bicycles,
BibRef
Xu, N.[Ning],
Liu, A.A.[An-An],
Liu, J.[Jing],
Nie, W.Z.[Wei-Zhi],
Su, Y.T.[Yu-Ting],
Scene graph captioner:
Image captioning based on structural visual representation,
JVCIR(58), 2019, pp. 477-485.
Elsevier DOI
1901
Image captioning, Scene graph, Structural representation, Attention
BibRef
He, X.W.[Xin-Wei],
Shi, B.G.[Bao-Guang],
Bai, X.[Xiang],
Xia, G.S.[Gui-Song],
Zhang, Z.X.[Zhao-Xiang],
Dong, W.S.[Wei-Sheng],
Image Caption Generation with Part of Speech Guidance,
PRL(119), 2019, pp. 229-237.
Elsevier DOI
1902
Image caption generation, Part-of-speech tags,
Long Short-Term Memory, Visual attributes
BibRef
Xiao, X.Y.[Xin-Yu],
Wang, L.F.[Ling-Feng],
Ding, K.[Kun],
Xiang, S.M.[Shi-Ming],
Pan, C.H.[Chun-Hong],
Dense semantic embedding network for image captioning,
PR(90), 2019, pp. 285-296.
Elsevier DOI
1903
Image captioning, Retrieval, High-level semantic information,
Visual concept, Densely embedding, Long short-term memory
BibRef
Zhang, X.R.[Xiang-Rong],
Wang, X.[Xin],
Tang, X.[Xu],
Zhou, H.Y.[Hui-Yu],
Li, C.[Chen],
Description Generation for Remote Sensing Images Using Attribute
Attention Mechanism,
RS(11), No. 6, 2019, pp. xx-yy.
DOI Link
1903
BibRef
Ding, S.T.[Song-Tao],
Qu, S.[Shiru],
Xi, Y.L.[Yu-Ling],
Sangaiah, A.K.[Arun Kumar],
Wan, S.H.[Shao-Hua],
Image caption generation with high-level image features,
PRL(123), 2019, pp. 89-95.
Elsevier DOI
1906
Image captioning, Language model,
Bottom-up attention mechanism, Faster R-CNN
BibRef
Liu, X.X.[Xiao-Xiao],
Xu, Q.Y.[Qing-Yang],
Wang, N.[Ning],
A survey on deep neural network-based image captioning,
VC(35), No. 3, March 2019, pp. 445-470.
WWW Link.
1906
BibRef
Hossain, M.Z.[Md. Zakir],
Sohel, F.[Ferdous],
Shiratuddin, M.F.[Mohd Fairuz],
Laga, H.[Hamid],
A Comprehensive Survey of Deep Learning for Image Captioning,
Surveys(51), No. 6, February 2019, pp. Article No 118.
DOI Link
1906
Survey, Captioning.
BibRef
Zhang, Z.J.[Zong-Jian],
Wu, Q.[Qiang],
Wang, Y.[Yang],
Chen, F.[Fang],
High-Quality Image Captioning With Fine-Grained and Semantic-Guided
Visual Attention,
MultMed(21), No. 7, July 2019, pp. 1681-1693.
IEEE DOI
1906
BibRef
Earlier:
Fine-Grained and Semantic-Guided Visual Attention for Image
Captioning,
WACV18(1709-1717)
IEEE DOI
1806
Visualization, Semantics, Feature extraction, Decoding,
Task analysis, Object oriented modeling, Image resolution,
fully convolutional network-long short term memory framework.
feedforward neural nets, image representation,
image segmentation, convolutional neural network,
Visualization
BibRef
Li, X.,
Jiang, S.,
Know More Say Less: Image Captioning Based on Scene Graphs,
MultMed(21), No. 8, August 2019, pp. 2117-2130.
IEEE DOI
1908
convolutional neural nets, feature extraction, graph theory,
image representation, learning (artificial intelligence),
vision-language
BibRef
Sharif, N.[Naeha],
White, L.[Lyndon],
Bennamoun, M.[Mohammed],
Liu, W.[Wei],
Shah, S.A.A.[Syed Afaq Ali],
LCEval: Learned Composite Metric for Caption Evaluation,
IJCV(127), No. 10, October 2019, pp. 1586-1610.
Springer DOI
1909
Fine-grained analysis.
BibRef
Zhang, Z.Y.[Zheng-Yuan],
Diao, W.H.[Wen-Hui],
Zhang, W.K.[Wen-Kai],
Yan, M.L.[Meng-Long],
Gao, X.[Xin],
Sun, X.[Xian],
LAM: Remote Sensing Image Captioning with Label-Attention Mechanism,
RS(11), No. 20, 2019, pp. xx-yy.
DOI Link
1910
BibRef
Fu, K.[Kun],
Li, Y.[Yang],
Zhang, W.K.[Wen-Kai],
Yu, H.F.[Hong-Feng],
Sun, X.[Xian],
Boosting Memory with a Persistent Memory Mechanism for Remote Sensing
Image Captioning,
RS(12), No. 11, 2020, pp. xx-yy.
DOI Link
2006
BibRef
Tan, J.H.,
Chan, C.S.,
Chuah, J.H.,
COMIC: Toward A Compact Image Captioning Model With Attention,
MultMed(21), No. 10, October 2019, pp. 2686-2696.
IEEE DOI
1910
embedded systems; feature extraction; image retrieval; matrix algebra.
BibRef
Zhou, L.,
Zhang, Y.,
Jiang, Y.,
Zhang, T.,
Fan, W.,
Re-Caption: Saliency-Enhanced Image Captioning Through Two-Phase
Learning,
IP(29), No. 1, 2020, pp. 694-709.
IEEE DOI
1910
feature extraction, image processing,
learning (artificial intelligence),
visual attribute
BibRef
Yang, L.[Liang],
Hu, H.F.[Hai-Feng],
Visual Skeleton and Reparative Attention for Part-of-Speech image
captioning system,
CVIU(189), 2019, pp. 102819.
Elsevier DOI
1911
Neural network, Visual attention, Image captioning
BibRef
Wang, J.B.[Jun-Bo],
Wang, W.[Wei],
Wang, L.[Liang],
Wang, Z.Y.[Zhi-Yong],
Feng, D.D.[David Dagan],
Tan, T.N.[Tie-Niu],
Learning Visual Relationship and Context-Aware Attention for Image
Captioning,
PR(98), 2020, pp. 107075.
Elsevier DOI
1911
Image captioning, Relational reasoning, Context-aware attention
BibRef
Xiao, X.,
Wang, L.,
Ding, K.,
Xiang, S.,
Pan, C.,
Deep Hierarchical Encoder-Decoder Network for Image Captioning,
MultMed(21), No. 11, November 2019, pp. 2942-2956.
IEEE DOI
1911
Visualization, Semantics, Hidden Markov models, Decoding,
Logic gates, Training, Computer architecture,
vision-sentence
BibRef
Jiang, T.[Teng],
Zhang, Z.[Zehan],
Yang, Y.[Yupu],
Modeling coverage with semantic embedding for image caption generation,
VC(35), No. 11, November 2019, pp. 1655-1665.
WWW Link.
1911
BibRef
Lu, X.,
Wang, B.,
Zheng, X.,
Sound Active Attention Framework for Remote Sensing Image Captioning,
GeoRS(58), No. 3, March 2020, pp. 1985-2000.
IEEE DOI
2003
Active attention, remote sensing image captioning, semantic understanding
BibRef
Li, Y.Y.[Yang-Yang],
Fang, S.K.[Shuang-Kang],
Jiao, L.C.[Li-Cheng],
Liu, R.J.[Rui-Jiao],
Shang, R.H.[Rong-Hua],
A Multi-Level Attention Model for Remote Sensing Image Captions,
RS(12), No. 6, 2020, pp. xx-yy.
DOI Link
2003
What are the important things in the image?
BibRef
Chen, X.H.[Xing-Han],
Zhang, M.X.[Ming-Xing],
Wang, Z.[Zheng],
Zuo, L.[Lin],
Li, B.[Bo],
Yang, Y.[Yang],
Leveraging unpaired out-of-domain data for image captioning,
PRL(132), 2020, pp. 132-140.
Elsevier DOI
2005
Image captioning, Out-of-domain data, Deep learning
BibRef
Xu, N.,
Zhang, H.,
Liu, A.,
Nie, W.,
Su, Y.,
Nie, J.,
Zhang, Y.,
Multi-Level Policy and Reward-Based Deep Reinforcement Learning
Framework for Image Captioning,
MultMed(22), No. 5, May 2020, pp. 1372-1383.
IEEE DOI
2005
Visualization, Measurement, Task analysis, Reinforcement learning,
Optimization, Adaptation models, Semantics, Multi-level policy,
image captioning
BibRef
Guo, L.,
Liu, J.,
Lu, S.,
Lu, H.,
Show, Tell, and Polish: Ruminant Decoding for Image Captioning,
MultMed(22), No. 8, August 2020, pp. 2149-2162.
IEEE DOI
2007
Decoding, Visualization, Planning, Training, Semantics,
Reinforcement learning, Task analysis, Image captioning,
rumination
BibRef
Feng, Q.,
Wu, Y.,
Fan, H.,
Yan, C.,
Xu, M.,
Yang, Y.,
Cascaded Revision Network for Novel Object Captioning,
CirSysVideo(30), No. 10, October 2020, pp. 3413-3421.
IEEE DOI
2010
Visualization, Semantics, Task analysis, Detectors, Training,
Knowledge engineering, Feature extraction, Captioning,
semantic matching
BibRef
Wei, H.Y.[Hai-Yang],
Li, Z.X.[Zhi-Xin],
Zhang, C.L.[Can-Long],
Ma, H.F.[Hui-Fang],
The synergy of double attention: Combine sentence-level and
word-level attention for image captioning,
CVIU(201), 2020, pp. 103068.
Elsevier DOI
2011
Image captioning, Sentence-level attention,
Word-level attention, Reinforcement learning
BibRef
Shilpa, M.[Mohankumar],
He, J.[Jun],
Zhao, Y.J.[Yi-Jia],
Sun, B.[Bo],
Yu, L.J.[Le-Jun],
Feedback evaluations to promote image captioning,
IET-IPR(14), No. 13, November 2020, pp. 3021-3027.
DOI Link
2012
BibRef
Liu, H.,
Zhang, S.,
Lin, K.,
Wen, J.,
Li, J.,
Hu, X.,
Vocabulary-Wide Credit Assignment for Training Image Captioning
Models,
IP(30), 2021, pp. 2450-2460.
IEEE DOI
2102
Training, Measurement, Task analysis, Vocabulary, Feature extraction,
Maximum likelihood estimation, Adaptation models
BibRef
Xu, N.[Ning],
Tian, H.S.[Hong-Shuo],
Wang, Y.H.[Yan-Hui],
Nie, W.Z.[Wei-Zhi],
Song, D.[Dan],
Liu, A.A.[An-An],
Liu, W.[Wu],
Coupled-dynamic learning for vision and language:
Exploring Interaction between different tasks,
PR(113), 2021, pp. 107829.
Elsevier DOI
2103
Image captioning, Image synthesis, Coupled dynamics
BibRef
Yang, L.,
Wang, H.,
Tang, P.,
Li, Q.,
CaptionNet: A Tailor-made Recurrent Neural Network for Generating
Image Descriptions,
MultMed(23), 2021, pp. 835-845.
IEEE DOI
2103
Visualization, Feature extraction, Semantics, Task analysis,
Predictive models, Computational modeling,
reinforcement learning
BibRef
Liu, A.A.[An-An],
Wang, Y.H.[Yan-Hui],
Xu, N.[Ning],
Liu, S.[Shan],
Li, X.[Xuanya],
Scene-Graph-Guided message passing network for dense captioning,
PRL(145), 2021, pp. 187-193.
Elsevier DOI
2104
Scene graph, Dense captioning, Message passing
BibRef
Zhang, L.[Le],
Zhang, Y.S.[Yan-Shuo],
Zhao, X.[Xin],
Zou, Z.X.[Ze-Xiao],
Image captioning via proximal policy optimization,
IVC(108), 2021, pp. 104126.
Elsevier DOI
2104
Image captioning, Reinforcement learning, Proximal policy optimization
BibRef
Ji, J.Z.[Jun-Zhong],
Du, Z.R.[Zhuo-Ran],
Zhang, X.D.[Xiao-Dan],
Divergent-convergent attention for image captioning,
PR(115), 2021, pp. 107928.
Elsevier DOI
2104
Image Captioning, Divergent Observation, Convergent Attention
BibRef
Wei, Y.W.[Yi-Wei],
Wu, C.L.[Chun-Lei],
Jia, Z.Y.[Zhi-Yang],
Hu, X.[XuFei],
Guo, S.[Shuang],
Shi, H.T.[Hai-Tao],
Past is important: Improved image captioning by looking back in time,
SP:IC(94), 2021, pp. 116183.
Elsevier DOI
2104
Image captioning, Reinforcement learning, Visual attention
BibRef
Zhang, Z.J.[Zong-Jian],
Wu, Q.[Qiang],
Wang, Y.[Yang],
Chen, F.[Fang],
Exploring region relationships implicitly:
Image captioning with visual relationship attention,
IVC(109), 2021, pp. 104146.
Elsevier DOI
2105
Image captioning, Visual relationship attention,
Relationship-level attention parallel attention mechanism,
Learned spatial constraint
BibRef
Zhang, Z.J.[Zong-Jian],
Wu, Q.[Qiang],
Wang, Y.[Yang],
Chen, F.[Fang],
Exploring Pairwise Relationships Adaptively From Linguistic Context
in Image Captioning,
MultMed(24), 2022, pp. 3101-3113.
IEEE DOI
2206
Visualization, Linguistics, Decoding, Modulation, Context modeling,
Adaptation models, Semantics, Bilinear attention,
visual relationship attention
BibRef
Li, X.L.[Xue-Long],
Zhang, X.T.[Xue-Ting],
Huang, W.[Wei],
Wang, Q.[Qi],
Truncation Cross Entropy Loss for Remote Sensing Image Captioning,
GeoRS(59), No. 6, June 2021, pp. 5246-5257.
IEEE DOI
2106
Feature extraction, Remote sensing, Entropy, Semantics, Decoding,
Optimization, Visualization, Image captioning, overfitting,
truncation cross entropy (TCE) loss
BibRef
Zhong, X.[Xian],
Nie, G.Z.[Guo-Zhang],
Huang, W.X.[Wen-Xin],
Liu, W.X.[Wen-Xuan],
Ma, B.[Bo],
Lin, C.W.[Chia-Wen],
Attention-guided image captioning with adaptive global and local
feature fusion,
JVCIR(78), 2021, pp. 103138.
Elsevier DOI
2107
Image captioning, Encoder-decoder, Spatial information, Adaptive attention
BibRef
Sumbul, G.[Gencer],
Nayak, S.[Sonali],
Demir, B.[Begüm],
SD-RSIC: Summarization-Driven Deep Remote Sensing Image Captioning,
GeoRS(59), No. 8, August 2021, pp. 6922-6934.
IEEE DOI
2108
Training, Standards, Semantics, Feature extraction, Remote sensing,
Neural networks, Task analysis, Caption summarization,
remote sensing (RS)
BibRef
Wu, J.[Jie],
Chen, T.S.[Tian-Shui],
Wu, H.F.[He-Feng],
Yang, Z.[Zhi],
Luo, G.C.[Guang-Chun],
Lin, L.[Liang],
Fine-Grained Image Captioning With Global-Local Discriminative
Objective,
MultMed(23), 2021, pp. 2413-2427.
IEEE DOI
2108
Training, Visualization, Task analysis, Semantics,
Reinforcement learning, Pipelines, Maximum likelihood estimation,
Self-retrieval
BibRef
Wu, L.X.[Ling-Xiang],
Xu, M.[Min],
Sang, L.[Lei],
Yao, T.[Ting],
Mei, T.[Tao],
Noise Augmented Double-Stream Graph Convolutional Networks for Image
Captioning,
CirSysVideo(31), No. 8, August 2021, pp. 3118-3127.
IEEE DOI
2108
Visualization, Training, Generators, Reinforcement learning,
Decoding, Streaming media, Recurrent neural networks, Captioning,
adaptive noise
BibRef
Nivedita, M.,
Chandrashekar, P.[Priyanka],
Mahapatra, S.[Shibani],
Phamila, Y.A.V.[Y. Asnath Victy],
Selvaperumal, S.K.[Sathish Kumar],
Image Captioning for Video Surveillance System using Neural Networks,
IJIG(21), No. 4, October 2021, pp. 2150044.
DOI Link
2110
BibRef
Wang, Q.[Qi],
Huang, W.[Wei],
Zhang, X.T.[Xue-Ting],
Li, X.L.[Xue-Long],
Word-Sentence Framework for Remote Sensing Image Captioning,
GeoRS(59), No. 12, December 2021, pp. 10532-10543.
IEEE DOI
2112
Remote sensing, Feature extraction, Generators, Decoding,
Task analysis, Visualization, Semantics, Deep learning,
word-sentence framework
BibRef
Wan, B.Y.[Bo-Yang],
Jiang, W.H.[Wen-Hui],
Fang, Y.M.[Yu-Ming],
Zhu, M.W.[Min-Wei],
Li, Q.[Qin],
Liu, Y.[Yang],
Revisiting image captioning via maximum discrepancy competition,
PR(122), 2022, pp. 108358.
Elsevier DOI
2112
Image captioning, Model comparison, Attention mechanism
BibRef
Chen, T.Y.[Tian-Yu],
Li, Z.X.[Zhi-Xin],
Wu, J.L.[Jing-Li],
Ma, H.F.[Hui-Fang],
Su, B.P.[Bian-Ping],
Improving image captioning with Pyramid Attention and SC-GAN,
IVC(117), 2022, pp. 104340.
Elsevier DOI
2112
Image captioning, Pyramid Attention network,
Self-critical training, Reinforcement learning, Sequence-level learning
BibRef
Zhou, Y.J.[Yu-Jie],
Long, J.F.[Jie-Feng],
Xu, S.P.[Su-Ping],
Shang, L.[Lin],
Attribute-driven image captioning via soft-switch pointer,
PRL(152), 2021, pp. 34-41.
Elsevier DOI
2112
Image captioning, Visual attributes detection, Attention, Pointing mechanism
BibRef
Zha, Z.J.[Zheng-Jun],
Liu, D.[Daqing],
Zhang, H.W.[Han-Wang],
Zhang, Y.D.[Yong-Dong],
Wu, F.[Feng],
Context-Aware Visual Policy Network for Fine-Grained Image Captioning,
PAMI(44), No. 2, February 2022, pp. 710-722.
IEEE DOI
2201
Visualization, Task analysis, Cognition, Decision making, Training,
Natural languages, Reinforcement learning, Image captioning,
policy network
BibRef
Wang, Q.Z.[Qing-Zhong],
Wan, J.[Jia],
Chan, A.B.[Antoni B.],
On Diversity in Image Captioning: Metrics and Methods,
PAMI(44), No. 2, February 2022, pp. 1035-1049.
IEEE DOI
2201
Measurement, Semantics, Learning (artificial intelligence),
Vegetation, Legged locomotion, Training, Computational modeling,
diversity metric
BibRef
Wang, J.[Jiuniu],
Xu, W.J.[Wen-Jia],
Wang, Q.Z.[Qing-Zhong],
Chan, A.B.[Antoni B.],
Compare and Reweight:
Distinctive Image Captioning Using Similar Images Sets,
ECCV20(I:370-386).
Springer DOI
2011
BibRef
Luo, G.F.[Gai-Fang],
Cheng, L.J.[Li-Jun],
Jing, C.[Chao],
Zhao, C.[Can],
Song, G.Z.[Guo-Zhu],
A thorough review of models, evaluation metrics, and datasets on
image captioning,
IET-IPR(16), No. 2, 2022, pp. 311-332.
DOI Link
2201
BibRef
Ben, H.X.[Hui-Xia],
Pan, Y.W.[Ying-Wei],
Li, Y.[Yehao],
Yao, T.[Ting],
Hong, R.C.[Ri-Chang],
Wang, M.[Meng],
Mei, T.[Tao],
Unpaired Image Captioning With Semantic-Constrained Self-Learning,
MultMed(24), 2022, pp. 904-916.
IEEE DOI
2202
Semantics, Image recognition, Training, Visualization, Decoding, Task analysis,
Dogs, Encoder-decoder networks, image captioning, self-supervised learning
BibRef
Song, P.P.[Pei-Pei],
Guo, D.[Dan],
Zhou, J.X.[Jin-Xing],
Xu, M.L.[Ming-Liang],
Wang, M.[Meng],
Memorial GAN With Joint Semantic Optimization for Unpaired Image
Captioning,
Cyber(53), No. 7, July 2023, pp. 4388-4399.
IEEE DOI
2307
Semantics, Synthetic aperture sonar, Visualization, Task analysis,
Optimization, Generative adversarial networks, Correlation,
unpaired image captioning
BibRef
Li, Y.[Yehao],
Yao, T.[Ting],
Pan, Y.W.[Ying-Wei],
Chao, H.Y.[Hong-Yang],
Mei, T.[Tao],
Pointing Novel Objects in Image Captioning,
CVPR19(12489-12498).
IEEE DOI
2002
BibRef
Liu, M.F.[Mao-Fu],
Hu, H.J.[Hui-Jun],
Li, L.J.[Ling-Jun],
Yu, Y.[Yan],
Guan, W.L.[Wei-Li],
Chinese Image Caption Generation via Visual Attention and Topic
Modeling,
Cyber(52), No. 2, February 2022, pp. 1247-1257.
IEEE DOI
2202
Visualization, Decoding, Semantics, Predictive models,
Feature extraction, Natural language processing,
visual attention
BibRef
Yang, Q.Q.[Qiao-Qiao],
Ni, Z.H.[Zi-Hao],
Ren, P.[Peng],
Meta captioning:
A meta learning based remote sensing image captioning framework,
PandRS(186), 2022, pp. 190-200.
Elsevier DOI
2203
Remote sensing image captioning, Meta learning
BibRef
Yang, X.[Xu],
Zhang, H.W.[Han-Wang],
Cai, J.F.[Jian-Fei],
Auto-Encoding and Distilling Scene Graphs for Image Captioning,
PAMI(44), No. 5, May 2022, pp. 2313-2327.
IEEE DOI
2204
Visualization, Decoding, Training, Roads, Pipelines, Dictionaries,
Semantics, Image captioning, scene graph, transfer learning,
knowledge distillation
BibRef
Yang, X.[Xu],
Zhang, H.W.[Han-Wang],
Cai, J.F.[Jian-Fei],
Deconfounded Image Captioning: A Causal Retrospect,
PAMI(45), No. 11, November 2023, pp. 12996-13010.
IEEE DOI
2310
BibRef
Yang, X.[Xu],
Tang, K.[Kaihua],
Zhang, H.W.[Han-Wang],
Cai, J.F.[Jian-Fei],
Auto-Encoding Scene Graphs for Image Captioning,
CVPR19(10677-10686).
IEEE DOI
2002
BibRef
Yang, Z.P.[Zuo-Peng],
Wang, P.B.[Peng-Bo],
Chu, T.S.[Tian-Shu],
Yang, J.[Jie],
Human-Centric Image Captioning,
PR(126), 2022, pp. 108545.
Elsevier DOI
2204
Human-centric, Image captioning, Feature hierarchization
BibRef
Li, X.[Xuan],
Zhang, W.K.[Wen-Kai],
Sun, X.[Xian],
Gao, X.[Xin],
Without detection: Two-step clustering features with local-global
attention for image captioning,
IET-CV(16), No. 3, 2022, pp. 280-294.
DOI Link
2204
BibRef
Yu, L.T.[Li-Tao],
Zhang, J.[Jian],
Wu, Q.[Qiang],
Dual Attention on Pyramid Feature Maps for Image Captioning,
MultMed(24), 2022, pp. 1775-1786.
IEEE DOI
2204
Visualization, Decoding, Task analysis, Semantics,
Feature extraction, Context modeling, Image captioning,
pyramid attention
BibRef
Zhang, M.[Min],
Chen, J.X.[Jing-Xiang],
Li, P.F.[Peng-Fei],
Jiang, M.[Ming],
Zhou, Z.[Zhe],
Topic scene graphs for image captioning,
IET-CV(16), No. 4, 2022, pp. 364-375.
DOI Link
2205
natural language processing
BibRef
Yu, Q.[Qiang],
Zhang, C.X.[Chun-Xia],
Weng, L.[Lubin],
Xiang, S.M.[Shi-Ming],
Pan, C.H.[Chun-Hong],
Scene captioning with deep fusion of images and point clouds,
PRL(158), 2022, pp. 9-15.
Elsevier DOI
2205
Scene captioning, Point cloud, Deep fusion
BibRef
Chaudhari, C.P.[Chaitrali Prasanna],
Devane, S.[Satish],
Improved Framework using Rider Optimization Algorithm for Precise Image
Caption Generation,
IJIG(22), No. 2, April 2022, pp. 2250021.
DOI Link
2205
BibRef
Shao, X.J.[Xiang-Jun],
Xiang, Z.L.[Zheng-Long],
Li, Y.X.[Yuan-Xiang],
Zhang, M.J.[Ming-Jie],
Variational joint self-attention for image captioning,
IET-IPR(16), No. 8, 2022, pp. 2075-2086.
DOI Link
2205
BibRef
Li, Y.C.[Yao-Chen],
Wu, C.[Chuan],
Li, L.[Ling],
Liu, Y.H.[Yue-Hu],
Zhu, J.[Jihua],
Caption Generation From Road Images for Traffic Scene Modeling,
ITS(23), No. 7, July 2022, pp. 7805-7816.
IEEE DOI
2207
Semantics, Roads, Visualization, Feature extraction,
Image reconstruction, Vehicle dynamics, Geometric analysis,
visual relationship detection
BibRef
Wang, Y.H.[Yan-Hui],
Xu, N.[Ning],
Liu, A.A.[An-An],
Li, W.H.[Wen-Hui],
Zhang, Y.D.[Yong-Dong],
High-Order Interaction Learning for Image Captioning,
CirSysVideo(32), No. 7, July 2022, pp. 4417-4430.
IEEE DOI
2207
Visualization, Semantics, Feature extraction, Decoding,
Task analysis, Ions, Encoding, Image captioning,
encoder-decoder framework
BibRef
Guo, D.D.[Dan-Dan],
Lu, R.Y.[Rui-Ying],
Chen, B.[Bo],
Zeng, Z.Q.[Ze-Qun],
Zhou, M.Y.[Ming-Yuan],
Matching Visual Features to Hierarchical Semantic Topics for Image
Paragraph Captioning,
IJCV(130), No. 8, August 2022, pp. 1920-1937.
Springer DOI
2207
BibRef
Demirel, B.[Berkan],
Cinbis, R.G.[Ramazan Gokberk],
Caption generation on scenes with seen and unseen object categories,
IVC(124), 2022, pp. 104515.
Elsevier DOI
2208
Zero-shot learning, Zero-shot image captioning
BibRef
Liu, Z.Y.[Zong-Yin],
Dong, A.M.[An-Ming],
Yu, J.G.[Ji-Guo],
Han, Y.B.[Yu-Bing],
Zhou, Y.[You],
Zhao, K.[Kai],
Scene classification for remote sensing images with self-attention
augmented CNN,
IET-IPR(16), No. 11, 2022, pp. 3085-3096.
DOI Link
2208
BibRef
Wu, X.X.[Xin-Xiao],
Zhao, W.T.[Wen-Tian],
Luo, J.B.[Jie-Bo],
Learning Cooperative Neural Modules for Stylized Image Captioning,
IJCV(130), No. 9, September 2022, pp. 2305-2320.
Springer DOI
2208
BibRef
Zhou, H.[Haonan],
Du, X.P.[Xiao-Ping],
Xia, L.[Lurui],
Li, S.[Sen],
Self-Learning for Few-Shot Remote Sensing Image Captioning,
RS(14), No. 18, 2022, pp. xx-yy.
DOI Link
2209
BibRef
Stefanini, M.[Matteo],
Cornia, M.[Marcella],
Baraldi, L.[Lorenzo],
Cascianelli, S.[Silvia],
Fiameni, G.[Giuseppe],
Cucchiara, R.[Rita],
From Show to Tell: A Survey on Deep Learning-Based Image Captioning,
PAMI(45), No. 1, January 2023, pp. 539-559.
IEEE DOI
2212
Survey, Image Captions. Visualization, Feature extraction, Task analysis,
Convolutional neural networks, Additives, Image coding, Training
BibRef
Wu, Y.[Yu],
Jiang, L.[Lu],
Yang, Y.[Yi],
Switchable Novel Object Captioner,
PAMI(45), No. 1, January 2023, pp. 1162-1173.
IEEE DOI
2212
Training, Visualization, Switches, Task analysis, Training data,
Decoding, Convolutional neural networks, Image captioning, zero-shot learning
BibRef
Yang, X.[Xu],
Zhang, H.W.[Han-Wang],
Gao, C.Y.[Chong-Yang],
Cai, J.F.[Jian-Fei],
Learning to Collocate Visual-Linguistic Neural Modules for Image
Captioning,
IJCV(131), No. 1, January 2023, pp. 82-100.
Springer DOI
2301
BibRef
Earlier: A1, A2, A4, Only:
Learning to Collocate Neural Modules for Image Captioning,
ICCV19(4249-4259)
IEEE DOI
2004
image processing, learning (artificial intelligence),
natural language processing, neural nets, Neural networks
BibRef
Ma, Y.W.[Yi-Wei],
Ji, J.Y.[Jia-Yi],
Sun, X.S.[Xiao-Shuai],
Zhou, Y.[Yiyi],
Ji, R.R.[Rong-Rong],
Towards local visual modeling for image captioning,
PR(138), 2023, pp. 109420.
Elsevier DOI
2303
Image captioning, Attention mechanism, Local visual modeling
BibRef
Barati, A.[Alireza],
Farsi, H.[Hassan],
Mohamadzadeh, S.[Sajad],
Integration of the latent variable knowledge into deep image
captioning with Bayesian modeling,
IET-IPR(17), No. 7, 2023, pp. 2256-2271.
DOI Link
2305
attention mechanism, automatic image captioning,
deep neural networks, high-level semantic concepts, latent variable
BibRef
Feng, J.L.[Jun-Long],
Zhao, J.P.[Jian-Ping],
Effectively Utilizing the Category Labels for Image Captioning,
IEICE(E106-D), No. 5, May 2023, pp. 617-624.
WWW Link.
2305
BibRef
Wang, D.P.[De-Peng],
Hu, Z.Z.[Zhen-Zhen],
Zhou, Y.[Yuanen],
Hong, R.C.[Ri-Chang],
Wang, M.[Meng],
A Text-Guided Generation and Refinement Model for Image Captioning,
MultMed(25), 2023, pp. 2966-2977.
IEEE DOI
2309
BibRef
Wang, Q.[Qi],
Huang, W.[Wei],
Zhang, X.T.[Xue-Ting],
Li, X.L.[Xue-Long],
GLCM: Global-Local Captioning Model for Remote Sensing Image
Captioning,
Cyber(53), No. 11, November 2023, pp. 6910-6922.
IEEE DOI
2310
BibRef
Ji, J.Y.[Jia-Yi],
Huang, X.Y.[Xiao-Yang],
Sun, X.S.[Xiao-Shuai],
Zhou, Y.[Yiyi],
Luo, G.[Gen],
Cao, L.J.[Liu-Juan],
Liu, J.Z.[Jian-Zhuang],
Shao, L.[Ling],
Ji, R.R.[Rong-Rong],
Multi-Branch Distance-Sensitive Self-Attention Network for Image
Captioning,
MultMed(25), 2023, pp. 3962-3974.
IEEE DOI
2310
BibRef
Cornia, M.[Marcella],
Baraldi, L.[Lorenzo],
Tal, A.[Ayellet],
Cucchiara, R.[Rita],
Fully-attentive iterative networks for region-based controllable
image and video captioning,
CVIU(237), 2023, pp. 103857.
Elsevier DOI
2311
Controllable captioning, Image captioning, Video captioning, Vision-and-language
BibRef
Al-Qatf, M.[Majjed],
Wang, X.[Xingfu],
Hawbani, A.[Ammar],
Abdussalam, A.[Amr],
Alsamhi, S.H.[Saeed Hammod],
Image Captioning With Novel Topics Guidance and Retrieval-Based
Topics Re-Weighting,
MultMed(25), 2023, pp. 5984-5999.
IEEE DOI
2311
BibRef
Zhu, P.P.[Pei-Pei],
Wang, X.[Xiao],
Luo, Y.[Yong],
Sun, Z.L.[Zheng-Long],
Zheng, W.S.[Wei-Shi],
Wang, Y.[Yaowei],
Chen, C.[Changwen],
Unpaired Image Captioning by Image-Level Weakly-Supervised Visual
Concept Recognition,
MultMed(25), 2023, pp. 6702-6716.
IEEE DOI
2311
BibRef
Hu, N.N.[Nan-Nan],
Ming, Y.[Yue],
Fan, C.X.[Chun-Xiao],
Feng, F.[Fan],
Lyu, B.Y.[Bo-Yang],
TSFNet: Triple-Stream Image Captioning,
MultMed(25), 2023, pp. 6904-6916.
IEEE DOI
2311
BibRef
González-Chávez, O.[Othón],
Ruiz, G.[Guillermo],
Moctezuma, D.[Daniela],
Ramirez-delReal, T.[Tania],
Are metrics measuring what they should? An evaluation of Image
Captioning task metrics,
SP:IC(120), 2024, pp. 117071.
Elsevier DOI
2312
Metrics, Image Captioning, Image understanding, Language model
BibRef
Padate, R.[Roshni],
Jain, A.[Amit],
Kalla, M.[Mukesh],
Sharma, A.[Arvind],
A Widespread Assessment and Open Issues on Image Captioning Models,
IJIG(23), No. 6, 2023, pp. 2350057.
DOI Link
2312
BibRef
Shao, Z.[Zhuang],
Han, J.G.[Jun-Gong],
Debattista, K.[Kurt],
Pang, Y.W.[Yan-Wei],
Textual Context-Aware Dense Captioning With Diverse Words,
MultMed(25), 2023, pp. 8753-8766.
IEEE DOI
2312
BibRef
Cheng, J.[Jun],
Wu, F.[Fuxiang],
Liu, L.[Liu],
Zhang, Q.[Qieshi],
Rutkowski, L.[Leszek],
Tao, D.C.[Da-Cheng],
InDecGAN: Learning to Generate Complex Images From Captions via
Independent Object-Level Decomposition and Enhancement,
MultMed(25), 2023, pp. 8279-8293.
IEEE DOI
2312
BibRef
Ding, N.[Ning],
Deng, C.R.[Chao-Rui],
Tan, M.K.[Ming-Kui],
Du, Q.[Qing],
Ge, Z.W.[Zhi-Wei],
Wu, Q.[Qi],
Image Captioning With Controllable and Adaptive Length Levels,
PAMI(46), No. 2, February 2024, pp. 764-779.
IEEE DOI
2401
Length-controllable image captioning, non-autoregressive image captioning,
length level reranking, refinement-enhanced sequence training
BibRef
Xu, G.H.[Guang-Hui],
Niu, S.C.[Shuai-Cheng],
Tan, M.K.[Ming-Kui],
Luo, Y.C.[Yu-Cheng],
Du, Q.[Qing],
Wu, Q.[Qi],
Towards Accurate Text-based Image Captioning with Content Diversity
Exploration,
CVPR21(12632-12641)
IEEE DOI
2111
Visualization, Image resolution, Benchmark testing,
Proposals, Optical character recognition software
BibRef
Zhu, P.P.[Pei-Pei],
Wang, X.[Xiao],
Zhu, L.[Lin],
Sun, Z.L.[Zheng-Long],
Zheng, W.S.[Wei-Shi],
Wang, Y.[Yaowei],
Chen, C.W.[Chang-Wen],
Prompt-Based Learning for Unpaired Image Captioning,
MultMed(26), 2024, pp. 379-393.
IEEE DOI
2402
Measurement, Semantics, Task analysis, Visualization,
Adversarial machine learning, Correlation, Training, Metric prompt,
unpaired image captioning
BibRef
Liu, A.A.[An-An],
Zhai, Y.C.[Ying-Chen],
Xu, N.[Ning],
Tian, H.[Hongshuo],
Nie, W.Z.[Wei-Zhi],
Zhang, Y.D.[Yong-Dong],
Event-Aware Retrospective Learning for Knowledge-Based Image
Captioning,
MultMed(26), 2024, pp. 4898-4911.
IEEE DOI
2404
Visualization, Knowledge engineering, Knowledge based systems,
Correlation, Semantics, Genomics, Bioinformatics, Image captioning,
retrospective learning
BibRef
Song, L.F.[Li-Fei],
Li, F.[Fei],
Wang, Y.[Ying],
Liu, Y.[Yu],
Wang, Y.[Yuanhua],
Xiang, S.M.[Shi-Ming],
Image captioning: Semantic selection unit with stacked residual
attention,
IVC(144), 2024, pp. 104965.
Elsevier DOI
2404
Image captioning, Semantic attributes, Semantic selection unit,
Transformer, Stacked residual attention
BibRef
Ajankar, S.[Sonali],
Dutta, T.[Tanima],
Image-Relevant Entities Knowledge-Aware News Image Captioning,
MultMedMag(31), No. 1, January 2024, pp. 88-98.
IEEE DOI
2404
Decoding, Task analysis, Feature extraction, Visualization, Encoding,
Internet, Encyclopedias, Publishing, Image capture, Online services,
Multisensory integration
BibRef
Dai, Z.Z.[Zhuang-Zhuang],
Tran, V.[Vu],
Markham, A.[Andrew],
Trigoni, N.[Niki],
Rahman, M.A.[M. Arif],
Wijayasingha, L.N.S.,
Stankovic, J.[John],
Li, C.[Chen],
EgoCap and EgoFormer:
First-person image captioning with context fusion,
PRL(181), 2024, pp. 50-56.
Elsevier DOI Code:
WWW Link.
2405
Image captioning, Storytelling, Dataset
BibRef
Shao, Z.[Zhuang],
Han, J.G.[Jun-Gong],
Debattista, K.[Kurt],
Pang, Y.W.[Yan-Wei],
DCMSTRD: End-to-end Dense Captioning via Multi-Scale Transformer
Decoding,
MultMed(26), 2024, pp. 7581-7593.
IEEE DOI
2405
Decoding, Transformers, Visualization, Feature extraction,
Task analysis, Computer architecture, Training, Dense captioning,
multi-scale language decoder (MSLD)
BibRef
Cornia, M.[Marcella],
Baraldi, L.[Lorenzo],
Fiameni, G.[Giuseppe],
Cucchiara, R.[Rita],
Generating More Pertinent Captions by Leveraging Semantics and Style on
Multi-Source Datasets,
IJCV(132), No. 5, May 2024, pp. 1701-1720.
Springer DOI
2405
BibRef
Barraco, M.[Manuele],
Sarto, S.[Sara],
Cornia, M.[Marcella],
Baraldi, L.[Lorenzo],
Cucchiara, R.[Rita],
With a Little Help from your own Past: Prototypical Memory Networks
for Image Captioning,
ICCV23(3009-3019)
IEEE DOI Code:
WWW Link.
2401
BibRef
Barraco, M.[Manuele],
Stefanini, M.[Matteo],
Cornia, M.[Marcella],
Cascianelli, S.[Silvia],
Baraldi, L.[Lorenzo],
Cucchiara, R.[Rita],
CaMEL: Mean Teacher Learning for Image Captioning,
ICPR22(4087-4094)
IEEE DOI
2212
Training, Measurement, Knowledge engineering, Visualization,
Source coding, Natural languages, Feature extraction
BibRef
Cornia, M.[Marcella],
Baraldi, L.[Lorenzo],
Cucchiara, R.[Rita],
Show, Control and Tell: A Framework for Generating Controllable and
Grounded Captions,
CVPR19(8299-8308).
IEEE DOI
2002
BibRef
Wang, L.X.[Lan-Xiao],
Qiu, H.Q.[He-Qian],
Qiu, B.[Benliu],
Meng, F.M.[Fan-Man],
Wu, Q.B.[Qing-Bo],
Li, H.L.[Hong-Liang],
TridentCap: Image-Fact-Style Trident Semantic Framework for Stylized
Image Captioning,
CirSysVideo(34), No. 5, May 2024, pp. 3563-3575.
IEEE DOI Code:
WWW Link.
2405
Semantics, Decoding, Dogs, Task analysis, Feature extraction,
Annotations, Visualization, Stylized image captioning,
pseudo labels filter
BibRef
Zhang, H.[Haonan],
Zeng, P.P.[Peng-Peng],
Gao, L.[Lianli],
Lyu, X.Y.[Xin-Yu],
Song, J.K.[Jing-Kuan],
Shen, H.T.[Heng Tao],
SPT: Spatial Pyramid Transformer for Image Captioning,
CirSysVideo(34), No. 6, June 2024, pp. 4829-4842.
IEEE DOI Code:
WWW Link.
2406
Transformers, Visualization, Feature extraction, Semantics, Decoding,
Task analysis, Spatial resolution, Image captioning, clustering
BibRef
Wang, H.Y.[Heng-You],
Song, K.[Kani],
Jiang, X.[Xiang],
He, Z.Q.[Zhi-Quan],
ragBERT: Relationship-aligned and grammar-wise BERT model for image
captioning,
IVC(148), 2024, pp. 105105.
Elsevier DOI
2407
Image captioning, Relationship tags, Grammar, BERT
BibRef
Li, J.Y.[Jing-Yu],
Zhang, L.[Lei],
Zhang, K.[Kun],
Hu, B.[Bo],
Xie, H.T.[Hong-Tao],
Mao, Z.D.[Zhen-Dong],
Cascade Semantic Prompt Alignment Network for Image Captioning,
CirSysVideo(34), No. 7, July 2024, pp. 5266-5281.
IEEE DOI Code:
WWW Link.
2407
Semantics, Visualization, Feature extraction, Detectors,
Integrated circuit modeling, Transformers, Task analysis, prompt
BibRef
Cheng, Q.[Qimin],
Xu, Y.Q.[Yu-Qi],
Huang, Z.Y.[Zi-Yang],
VCC-DiffNet: Visual Conditional Control Diffusion Network for Remote
Sensing Image Captioning,
RS(16), No. 16, 2024, pp. 2961.
DOI Link
2408
BibRef
Zou, Y.[Yang],
Liao, S.Y.[Shi-Yu],
Wang, Q.F.[Qi-Fei],
Chinese image captioning with fusion encoder and visual keyword
search,
IET-IPR(18), No. 11, 2024, pp. 3055-3069.
DOI Link
2409
Chinese image captioning, fusion encoder, image retrieval,
sentence-level optimization, visual keyword search
BibRef
Chen, S.J.[Si-Jin],
Zhu, H.Y.[Hong-Yuan],
Li, M.S.[Ming-Sheng],
Chen, X.[Xin],
Guo, P.[Peng],
Lei, Y.J.[Yin-Jie],
Yu, G.[Gang],
Li, T.[Taihao],
Chen, T.[Tao],
Vote2Cap-DETR++: Decoupling Localization and Describing for
End-to-End 3D Dense Captioning,
PAMI(46), No. 11, November 2024, pp. 7331-7347.
IEEE DOI
2410
BibRef
Earlier: A1, A2, A4, A6, A7, A9, Only:
End-to-End 3D Dense Captioning with Vote2Cap-DETR,
CVPR23(11124-11133)
IEEE DOI
2309
Location awareness, Task analysis, Transformers, Solid modeling,
Decoding, Pipelines, 3D dense captioning, 3D scene understanding,
transformers
BibRef
Lv, F.X.[Fei-Xiao],
Wang, R.[Rui],
Jing, L.H.[Li-Hua],
Dai, P.W.[Peng-Wen],
HIST: Hierarchical and sequential transformer for image captioning,
IET-CV(18), No. 7, 2024, pp. 1043-1056.
DOI Link
2411
computer vision, feature extraction,
learning (artificial intelligence), neural nets
BibRef
Du, R.[Runyan],
Zhang, W.K.[Wen-Kai],
Li, S.[Shuoke],
Chen, J.L.[Jia-Liang],
Guo, Z.[Zhi],
Spatial guided image captioning: Guiding attention with object's
spatial interaction,
IET-IPR(18), No. 12, 2024, pp. 3368-3380.
DOI Link
2411
image representation, image texture
BibRef
Li, Y.P.[Yun-Peng],
Zhang, X.R.[Xiang-Rong],
Zhang, T.Y.[Tian-Yang],
Wang, G.C.[Guan-Chun],
Wang, X.L.[Xin-Lin],
Li, S.[Shuo],
A Patch-Level Region-Aware Module with a Multi-Label Framework for
Remote Sensing Image Captioning,
RS(16), No. 21, 2024, pp. 3987.
DOI Link
2411
BibRef
Zhang, K.[Ke],
Li, P.[Peijie],
Wang, J.Q.[Jian-Qiang],
A Review of Deep Learning-Based Remote Sensing Image Caption:
Methods, Models, Comparisons and Future Directions,
RS(16), No. 21, 2024, pp. 4113.
DOI Link
2411
BibRef
Das, S.[Subham],
Sekhar, C.C.[C. Chandra],
Leveraging Generated Image Captions for Visual Commonsense Reasoning,
ICIP24(2508-2514)
IEEE DOI
2411
Visualization, Accuracy, Image color analysis, Semantics,
Natural languages, Transformers,
Vision and Language Transformers
BibRef
Chaffin, A.[Antoine],
Kijak, E.[Ewa],
Claveau, V.[Vincent],
Distinctive Image Captioning: Leveraging Ground Truth Captions in
Clip Guided Reinforcement Learning,
ICIP24(2550-2556)
IEEE DOI
2411
Training, Vocabulary, Costs, Grounding, Computational modeling,
Reinforcement learning, Image captioning, Cross-modal retrieval,
Reinforcement learning
BibRef
Jeong, K.[Kiyoon],
Lee, W.[Woojun],
Nam, W.[Woongchan],
Ma, M.[Minjeong],
Kang, P.[Pilsung],
Technical Report of NICE Challenge at CVPR 2024: Caption Re-ranking
Evaluation Using Ensembled CLIP and Consensus Scores,
NICE24(7366-7372)
IEEE DOI Code:
WWW Link.
2410
Measurement, Training, Semantics, Pipelines, Writing,
Caption Reranking, Image Captioning, Caption Evaluation
BibRef
Kim, T.[Taehoon],
Marsden, M.[Mark],
Ahn, P.[Pyunghwan],
Kim, S.[Sangyun],
Lee, S.[Sihaeng],
Sala, A.[Alessandra],
Kim, S.H.[Seung Hwan],
Large-Scale Bidirectional Training for Zero-Shot Image Captioning,
NICE24(7373-7383)
IEEE DOI
2410
Training, Measurement, Accuracy, Computer architecture,
Feature extraction, Data models, zero-shot, image captioning,
vision-language
BibRef
Kim, T.[Taehoon],
Ahn, P.[Pyunghwan],
Kim, S.[Sangyun],
Lee, S.[Sihaeng],
Marsden, M.[Mark],
Sala, A.[Alessandra],
Kim, S.H.[Seung Hwan],
Han, B.H.[Bo-Hyung],
Lee, K.M.[Kyoung Mu],
Lee, H.L.[Hong-Lak],
Bae, K.[Kyounghoon],
Wu, X.Y.[Xiang-Yu],
Gao, Y.[Yi],
Zhang, H.L.[Hai-Liang],
Yang, Y.[Yang],
Guo, W.[Weili],
Lu, J.F.[Jian-Feng],
Oh, Y.[Youngtaek],
Cho, J.W.[Jae Won],
Kim, D.J.[Dong-Jin],
Kweon, I.S.[In So],
Kim, J.[Junmo],
Kang, W.[Wooyoung],
Jhoo, W.Y.[Won Young],
Roh, B.[Byungseok],
Mun, J.[Jonghwan],
Oh, S.[Solgil],
Ak, K.E.[Kenan Emir],
Lee, G.G.[Gwang-Gook],
Xu, Y.[Yan],
Shen, M.W.[Ming-Wei],
Hwang, K.[Kyomin],
Shin, W.S.[Won-Sik],
Lee, K.[Kamin],
Park, W.[Wonhark],
Lee, D.[Dongkwan],
Kwak, N.[Nojun],
Wang, Y.J.[Yu-Jin],
Wang, Y.[Yimu],
Gu, T.C.[Tian-Cheng],
Lv, X.C.[Xing-Chang],
Sun, M.[Mingmao],
NICE: CVPR 2023 Challenge on Zero-shot Image Captioning,
NICE24(7356-7365)
IEEE DOI
2410
Training, Adaptation models, Visualization, Computational modeling,
Training data, Image captioning, Vision-language models,
Multimodal representation
BibRef
Urbanek, J.[Jack],
Bordes, F.[Florian],
Astolfi, P.[Pietro],
Williamson, M.[Mary],
Sharma, V.[Vasu],
Romero-Soriano, A.[Adriana],
A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style
Models on Dense Captions,
CVPR24(26690-26699)
IEEE DOI
2410
Training, Visualization, Computational modeling, Benchmark testing,
Reliability
BibRef
Nebbia, G.[Giacomo],
Kovashka, A.[Adriana],
Image-caption difficulty for efficient weakly-supervised object
detection from in-the-wild data,
L3D-IVU24(2596-2605)
IEEE DOI
2410
Training, Deep learning, Costs, Filtering, Computational modeling
BibRef
Sakaino, H.[Hidetomo],
Phuong, T.N.[Thao Nguyen],
Duy, V.N.[Vinh Nguyen],
PV-Cap: 3D Dynamic Scene Understanding Through Open Physics-based
Vocabulary,
AICity24(7932-7942)
IEEE DOI
2410
Deep learning, Training, Solid modeling, Vocabulary, Roads, Cameras,
open vocabulary, 3D events, outdoor scene, dynamic caption, 3D-CPP,
natural scene
BibRef
Kong, F.[Fanjie],
Chen, Y.B.[Yan-Bei],
Cai, J.R.[Jia-Rui],
Modolo, D.[Davide],
Hyperbolic Learning with Synthetic Captions for Open-World Detection,
CVPR24(16762-16771)
IEEE DOI
2410
Training, Visualization, Grounding, Noise, Detectors, Object detection,
Hyperbolic Learning, Open-World, Detection, Synthetic Captions
BibRef
Zeng, Z.Q.[Ze-Qun],
Xie, Y.[Yan],
Zhang, H.[Hao],
Chen, C.[Chiyu],
Chen, B.[Bo],
Wang, Z.J.[Zheng-Jue],
MeaCap: Memory-Augmented Zero-shot Image Captioning,
CVPR24(14100-14110)
IEEE DOI Code:
WWW Link.
2410
Measurement, Codes, Accuracy,
Integrated circuit modeling, zero-shot image captioning,
hallucination
BibRef
Wada, Y.[Yuiga],
Kaneda, K.[Kanta],
Saito, D.[Daichi],
Sugiura, K.[Komei],
Polos: Multimodal Metric Learning from Human Feedback for Image
Captioning,
CVPR24(13559-13568)
IEEE DOI
2410
Measurement, Correlation, Computational modeling,
Contrastive learning, Benchmark testing, Feature extraction,
human feedback
BibRef
Huang, X.K.[Xiao-Ke],
Wang, J.F.[Jian-Feng],
Tang, Y.S.[Yan-Song],
Zhang, Z.[Zheng],
Hu, H.[Han],
Lu, J.W.[Ji-Wen],
Wang, L.J.[Li-Juan],
Liu, Z.C.[Zi-Cheng],
Segment and Caption Anything,
CVPR24(13405-13417)
IEEE DOI
2410
Training, Costs, Computational modeling, Semantics,
Memory management, Object detection, Segmentation, Image Captioning
BibRef
Ge, Y.H.[Yun-Hao],
Zeng, X.H.[Xiao-Hui],
Huffman, J.S.[Jacob Samuel],
Lin, T.Y.[Tsung-Yi],
Liu, M.Y.[Ming-Yu],
Cui, Y.[Yin],
Visual Fact Checker: Enabling High-Fidelity Detailed Caption
Generation,
CVPR24(14033-14042)
IEEE DOI
2410
Visualization, Solid modeling, Computational modeling, Pipelines,
Text to image, Object detection, captioning, LLM
BibRef
Ruan, J.[Jie],
Wu, Y.[Yue],
Wan, X.J.[Xiao-Jun],
Zhu, Y.S.[Yue-Sheng],
Describe Images in a Boring Way:
Towards Cross-Modal Sarcasm Generation,
WACV24(5689-5698)
IEEE DOI
2404
Correlation, Codes, Training data, Data mining, Algorithms,
Vision + language and/or other modalities
BibRef
Hirsch, E.[Elad],
Tal, A.[Ayellet],
CLID: Controlled-Length Image Descriptions with Limited Data,
WACV24(5519-5529)
IEEE DOI
2404
Training, Codes, Data models, Algorithms,
Vision + language and/or other modalities
BibRef
Petryk, S.[Suzanne],
Whitehead, S.[Spencer],
Gonzalez, J.E.[Joseph E.],
Darrell, T.J.[Trevor J.],
Rohrbach, A.[Anna],
Rohrbach, M.[Marcus],
Simple Token-Level Confidence Improves Caption Correctness,
WACV24(5730-5740)
IEEE DOI
2404
Aggregates, Training data, Cognition, Data models, Algorithms,
Vision + language and/or other modalities
BibRef
Sabir, A.[Ahmed],
Word to Sentence Visual Semantic Similarity for Caption Generation:
Lessons Learned,
MVA23(1-5)
DOI Link
2403
Visualization, Machine vision, Semantics, Context modeling
BibRef
Verma, A.[Anand],
Agarwal, S.[Saurabh],
Arya, K.V.,
Petrlik, I.[Ivan],
Esparza, R.[Roberto],
Rodriguez, C.[Ciro],
Image Captioning with Reinforcement Learning,
ICCVMI23(1-7)
IEEE DOI
2403
Measurement, Training,
Machine learning algorithms, Reinforcement learning, SPICE, MS COCO
BibRef
Wei, Y.C.[Yi-Chao],
Li, L.[Lin],
Geng, S.L.[Sheng-Ling],
Remote Sensing Image Captioning Using Hire-MLP,
CVIDL23(109-112)
IEEE DOI
2403
Measurement, Deep learning, Visualization, Image recognition,
Computational modeling, Neural networks, Feature extraction
BibRef
Fan, J.[Jiashuo],
Liang, Y.[Yaoyuan],
Liu, L.[Leyao],
Huang, S.[Shaolun],
Zhang, L.[Lei],
RCA-NOC: Relative Contrastive Alignment for Novel Object Captioning,
ICCV23(15464-15474)
IEEE DOI
2401
BibRef
Li, R.[Runjia],
Sun, S.Y.[Shu-Yang],
Elhoseiny, M.[Mohamed],
Torr, P.[Philip],
OxfordTVG-HIC: Can Machine Make Humorous Captions from Images?,
ICCV23(20236-20246)
IEEE DOI
2401
BibRef
Hu, A.[Anwen],
Chen, S.Z.[Shi-Zhe],
Zhang, L.[Liang],
Jin, Q.[Qin],
Explore and Tell: Embodied Visual Captioning in 3D Environments,
ICCV23(2482-2491)
IEEE DOI Code:
WWW Link.
2401
BibRef
Kang, W.[Wooyoung],
Mun, J.[Jonghwan],
Lee, S.J.[Sung-Jun],
Roh, B.[Byungseok],
Noise-aware Learning from Web-crawled Image-Text Data for Image
Captioning,
ICCV23(2930-2940)
IEEE DOI Code:
WWW Link.
2401
BibRef
Fei, J.J.[Jun-Jie],
Wang, T.[Teng],
Zhang, J.[Jinrui],
He, Z.Y.[Zhen-Yu],
Wang, C.J.[Cheng-Jie],
Zheng, F.[Feng],
Transferable Decoding with Visual Entities for Zero-Shot Image
Captioning,
ICCV23(3113-3123)
IEEE DOI Code:
WWW Link.
2401
BibRef
Kornblith, S.[Simon],
Li, L.[Lala],
Wang, Z.[Zirui],
Nguyen, T.[Thao],
Guiding image captioning models toward more specific captions,
ICCV23(15213-15223)
IEEE DOI
2401
BibRef
Kim, Y.[Yeonju],
Kim, J.[Junho],
Lee, B.K.[Byung-Kwan],
Shin, S.[Sebin],
Ro, Y.M.[Yong Man],
Mitigating Dataset Bias in Image Captioning Through Clip
Confounder-Free Captioning Network,
ICIP23(1720-1724)
IEEE DOI Code:
WWW Link.
2312
BibRef
Dessì, R.[Roberto],
Bevilacqua, M.[Michele],
Gualdoni, E.[Eleonora],
Rakotonirina, N.C.[Nathanaël Carraz],
Franzon, F.[Francesca],
Baroni, M.[Marco],
Cross-Domain Image Captioning with Discriminative Finetuning,
CVPR23(6935-6944)
IEEE DOI
2309
BibRef
Vo, D.M.[Duc Minh],
Luong, Q.A.[Quoc-An],
Sugimoto, A.[Akihiro],
Nakayama, H.[Hideki],
A-CAP: Anticipation Captioning with Commonsense Knowledge,
CVPR23(10824-10833)
IEEE DOI
2309
BibRef
Kuo, C.W.[Chia-Wen],
Kira, Z.[Zsolt],
HAAV: Hierarchical Aggregation of Augmented Views for Image
Captioning,
CVPR23(11039-11049)
IEEE DOI
2309
BibRef
Ramos, R.[Rita],
Martins, B.[Bruno],
Elliott, D.[Desmond],
Kementchedjhieva, Y.[Yova],
Smallcap: Lightweight Image Captioning Prompted with Retrieval
Augmentation,
CVPR23(2840-2849)
IEEE DOI
2309
BibRef
Hirota, Y.[Yusuke],
Nakashima, Y.[Yuta],
Garcia, N.[Noa],
Model-Agnostic Gender Debiased Image Captioning,
CVPR23(15191-15200)
IEEE DOI
2309
BibRef
Tran, H.T.T.[Huyen Thi Thanh],
Okatani, T.[Takayuki],
Bright as the Sun: In-depth Analysis of Imagination-driven Image
Captioning,
ACCV22(IV:675-691).
Springer DOI
2307
BibRef
Phueaksri, I.[Itthisak],
Kastner, M.A.[Marc A.],
Kawanishi, Y.[Yasutomo],
Komamizu, T.[Takahiro],
Ide, I.[Ichiro],
Towards Captioning an Image Collection from a Combined Scene Graph
Representation Approach,
MMMod23(I: 178-190).
Springer DOI
2304
BibRef
Zhang, Y.[Youyuan],
Wang, J.[Jiuniu],
Wu, H.[Hao],
Xu, W.J.[Wen-Jia],
Distinctive Image Captioning via Clip Guided Group Optimization,
CMHRI22(223-238).
Springer DOI
2304
BibRef
Qiu, Y.[Yue],
Yamamoto, S.[Shintaro],
Yamada, R.[Ryosuke],
Suzuki, R.[Ryota],
Kataoka, H.[Hirokatsu],
Iwata, K.[Kenji],
Satoh, Y.[Yutaka],
3D Change Localization and Captioning from Dynamic Scans of Indoor
Scenes,
WACV23(1176-1185)
IEEE DOI
2302
Location awareness, Point cloud compression, Image recognition,
Limiting, Detectors, Benchmark testing, 3D computer vision
BibRef
Honda, U.[Ukyo],
Watanabe, T.[Taro],
Matsumoto, Y.[Yuji],
Switching to Discriminative Image Captioning by Relieving a
Bottleneck of Reinforcement Learning,
WACV23(1124-1134)
IEEE DOI
2302
Vocabulary, Limiting, Computational modeling, Switches,
Reinforcement learning, Control systems,
visual reasoning
BibRef
Sui, J.H.[Jia-Hong],
Yu, H.M.[Hui-Min],
Liang, X.Y.[Xin-Yue],
Ping, P.[Ping],
Image Caption Method Based on Graph Attention Network with Global
Context,
ICIVC22(480-487)
IEEE DOI
2301
Deep learning, Visualization, Image coding, Semantics,
Neural networks, Image representation, Feature extraction, global feature
BibRef
Arguello, P.[Paula],
Lopez, J.[Jhon],
Hinojosa, C.[Carlos],
Arguello, H.[Henry],
Optics Lens Design for Privacy-Preserving Scene Captioning,
ICIP22(3551-3555)
IEEE DOI
2211
Integrated optics, Privacy, Optical design, Optical distortion,
Optical detectors, Optical imaging, Feature extraction,
Computational Optics
BibRef
Meng, Z.H.[Zi-Hang],
Yang, D.[David],
Cao, X.F.[Xue-Fei],
Shah, A.[Ashish],
Lim, S.N.[Ser-Nam],
Object-Centric Unsupervised Image Captioning,
ECCV22(XXXVI:219-235).
Springer DOI
2211
BibRef
Wang, Z.[Zhen],
Chen, L.[Long],
Ma, W.B.[Wen-Bo],
Han, G.X.[Guang-Xing],
Niu, Y.[Yulei],
Shao, J.[Jian],
Xiao, J.[Jun],
Explicit Image Caption Editing,
ECCV22(XXXVI:113-129).
Springer DOI
2211
BibRef
Jiao, Y.[Yang],
Chen, S.X.[Shao-Xiang],
Jie, Z.Q.[Ze-Qun],
Chen, J.J.[Jing-Jing],
Ma, L.[Lin],
Jiang, Y.G.[Yu-Gang],
MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes,
ECCV22(XXXV:528-545).
Springer DOI
2211
BibRef
Nagrani, A.[Arsha],
Seo, P.H.[Paul Hongsuck],
Seybold, B.[Bryan],
Hauth, A.[Anja],
Manen, S.[Santiago],
Sun, C.[Chen],
Schmid, C.[Cordelia],
Learning Audio-Video Modalities from Image Captions,
ECCV22(XIV:407-426).
Springer DOI
2211
BibRef
Tewel, Y.[Yoad],
Shalev, Y.[Yoav],
Schwartz, I.[Idan],
Wolf, L.B.[Lior B.],
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic
Arithmetic,
CVPR22(17897-17907)
IEEE DOI
2210
Knowledge engineering, Training, Measurement, Visualization,
Text recognition, Semantics, Magnetic heads,
Vision+language, Transfer/low-shot/long-tail learning
BibRef
Truong, P.[Prune],
Danelljan, M.[Martin],
Yu, F.[Fisher],
Van Gool, L.J.[Luc J.],
Probabilistic Warp Consistency for Weakly-Supervised Semantic
Correspondences,
CVPR22(8698-8708)
IEEE DOI
2210
Image resolution, Costs, Semantics, Computer architecture,
Benchmark testing, Probabilistic logic, Motion and tracking, retrieval
BibRef
Chan, D.M.[David M.],
Myers, A.[Austin],
Vijayanarasimhan, S.[Sudheendra],
Ross, D.A.[David A.],
Seybold, B.[Bryan],
Canny, J.F.[John F.],
What's in a Caption? Dataset-Specific Linguistic Diversity and Its
Effect on Visual Description Models and Metrics,
VDU22(4739-4748)
IEEE DOI
2210
Measurement, Visualization, Analytical models, Video description,
Computational modeling, Training data, Linguistics
BibRef
Popattia, M.[Murad],
Rafi, M.[Muhammad],
Qureshi, R.[Rizwan],
Nawaz, S.[Shah],
Guiding Attention using Partial-Order Relationships for Image
Captioning,
MULA22(4670-4679)
IEEE DOI
2210
Training, Measurement, Visualization, Semantics, Computer architecture
BibRef
Mohamed, Y.[Youssef],
Khan, F.F.[Faizan Farooq],
Haydarov, K.[Kilichbek],
Elhoseiny, M.[Mohamed],
It is Okay to Not Be Okay: Overcoming Emotional Bias in Affective
Image Captioning by Contrastive Data Collection,
CVPR22(21231-21240)
IEEE DOI
2210
Measurement, Codes, Human intelligence, Data collection, Data models,
Datasets and evaluation, Others, Vision + language
BibRef
Chen, J.[Jun],
Guo, H.[Han],
Yi, K.[Kai],
Li, B.Y.[Bo-Yang],
Elhoseiny, M.[Mohamed],
VisualGPT: Data-efficient Adaptation of Pretrained Language Models
for Image Captioning,
CVPR22(18009-18019)
IEEE DOI
2210
Training, Representation learning, Adaptation models,
Visualization, Computational modeling, Semantics, Linguistics,
Transfer/low-shot/long-tail learning
BibRef
Chen, S.[Simin],
Song, Z.H.[Zi-He],
Haque, M.[Mirazul],
Liu, C.[Cong],
Yang, W.[Wei],
NICGSlowDown: Evaluating the Efficiency Robustness of Neural Image
Caption Generation Models,
CVPR22(15344-15353)
IEEE DOI
2210
Visualization, Computational modeling, Perturbation methods,
Robustness, Real-time systems,
Efficient learning and inferences
BibRef
Hirota, Y.[Yusuke],
Nakashima, Y.[Yuta],
Garcia, N.[Noa],
Quantifying Societal Bias Amplification in Image Captioning,
CVPR22(13440-13449)
IEEE DOI
2210
Measurement, Equalizers, Computational modeling, Focusing,
Predictive models, Skin, Transparency, fairness, accountability,
Vision + language
BibRef
Beddiar, D.[Djamila],
Oussalah, M.[Mourad],
Seppänen, T.[Tapio],
Explainability for Medical Image Captioning,
IPTA22(1-6)
IEEE DOI
2206
Visualization, Computational modeling, Semantics,
Feature extraction, Decoding, Convolutional neural networks,
Artificial Intelligence Explainability
BibRef
Bounab, Y.[Yazid],
Oussalah, M.[Mourad],
Ferdenache, A.[Ahlam],
Reconciling Image Captioning and User's Comments for Urban Tourism,
IPTA20(1-6)
IEEE DOI
2206
Visualization, Databases, Tourism industry, Pipelines, Tools, Internet,
Planning, Image captioning, social media, image description,
google vision API
BibRef
Zha, Z.W.[Zhi-Wei],
Zhou, P.F.[Peng-Fei],
Bai, C.[Cong],
Exploring Implicit and Explicit Relations with the Dual Relation-Aware
Network for Image Captioning,
MMMod22(II:97-108).
Springer DOI
2203
BibRef
Ruta, D.[Dan],
Motiian, S.[Saeid],
Faieta, B.[Baldo],
Lin, Z.[Zhe],
Jin, H.L.[Hai-Lin],
Filipkowski, A.[Alex],
Gilbert, A.[Andrew],
Collomosse, J.[John],
ALADIN: All Layer Adaptive Instance Normalization for Fine-grained
Style Similarity,
ICCV21(11906-11915)
IEEE DOI
2203
Training, Representation learning, Visualization,
Adaptation models, User-generated content, Neural generative models
BibRef
Nguyen, K.[Kien],
Tripathi, S.[Subarna],
Du, B.[Bang],
Guha, T.[Tanaya],
Nguyen, T.Q.[Truong Q.],
In Defense of Scene Graphs for Image Captioning,
ICCV21(1387-1396)
IEEE DOI
2203
Convolutional codes, Visualization, Image coding, Semantics,
Pipelines, Generators, Vision + language, Scene analysis and understanding
BibRef
Shi, J.[Jiahe],
Li, Y.[Yali],
Wang, S.J.[Sheng-Jin],
Partial Off-policy Learning: Balance Accuracy and Diversity for
Human-Oriented Image Captioning,
ICCV21(2167-2176)
IEEE DOI
2203
Correlation, Computational modeling, Reinforcement learning,
Generative adversarial networks, Task analysis,
BibRef
Alahmadi, R.[Rehab],
Hahn, J.[James],
Improve Image Captioning by Estimating the Gazing Patterns from the
Caption,
WACV22(2453-2462)
IEEE DOI
2202
Visualization, Computational modeling,
Neural networks, Feature extraction,
Vision and Languages Scene Understanding
BibRef
Biten, A.F.[Ali Furkan],
Gómez, L.[Lluís],
Karatzas, D.[Dimosthenis],
Let there be a clock on the beach:
Reducing Object Hallucination in Image Captioning,
WACV22(2473-2482)
IEEE DOI
2202
Measurement, Training, Visualization, Analytical models,
Computational modeling, Training data, Vision and Languages
BibRef
Deb, T.[Tonmoay],
Sadmanee, A.[Akib],
Bhaumik, K.K.[Kishor Kumar],
Ali, A.A.[Amin Ahsan],
Amin, M.A.[M Ashraful],
Rahman, A.K.M.M.[A.K.M. Mahbubur],
Variational Stacked Local Attention Networks for Diverse Video
Captioning,
WACV22(2493-2502)
IEEE DOI
2202
Measurement, Visualization, Stacking, Redundancy, Natural languages,
Streaming media, Syntactics, Vision and Languages Datasets,
Analysis and Understanding
BibRef
Sharif, N.[Naeha],
White, L.[Lyndon],
Bennamoun, M.[Mohammed],
Liu, W.[Wei],
Shah, S.A.A.[Syed Afaq Ali],
WEmbSim: A Simple yet Effective Metric for Image Captioning,
DICTA20(1-8)
IEEE DOI
2201
Measurement, Correlation, Databases, Digital images,
Machine learning, SPICE, Task analysis, Image Captioning, Word Embeddings
BibRef
Qiu, J.Y.[Jia-Yan],
Yang, Y.D.[Yi-Ding],
Wang, X.[Xinchao],
Tao, D.C.[Da-Cheng],
Scene Essence,
CVPR21(8318-8329)
IEEE DOI
2111
Image recognition, Graph neural networks,
Labeling, Lenses
BibRef
Hosseinzadeh, M.[Mehrdad],
Wang, Y.[Yang],
Image Change Captioning by Learning from an Auxiliary Task,
CVPR21(2724-2733)
IEEE DOI
2111
Training, Image color analysis, Image retrieval,
Semantics, Benchmark testing
BibRef
Chen, L.[Long],
Jiang, Z.H.[Zhi-Hong],
Xiao, J.[Jun],
Liu, W.[Wei],
Human-like Controllable Image Captioning with Verb-specific Semantic
Roles,
CVPR21(16841-16851)
IEEE DOI
2111
Visualization, Codes, Semantics, Benchmark testing,
Controllability
BibRef
Chen, D.Z.Y.[Dave Zhen-Yu],
Gholami, A.[Ali],
Nießner, M.[Matthias],
Chang, A.X.[Angel X.],
Scan2Cap: Context-aware Dense Captioning in RGB-D Scans,
CVPR21(3192-3202)
IEEE DOI
2111
Location awareness, Message passing,
Natural languages, Pipelines, Computer architecture, Object detection
BibRef
Luong, Q.A.[Quoc-An],
Vo, D.M.[Duc Minh],
Sugimoto, A.[Akihiro],
Saliency based Subject Selection for Diverse Image Captioning,
MVA21(1-5)
DOI Link
2109
Measurement, Visualization, Diversity methods
BibRef
Sharif, N.[Naeha],
Bennamoun, M.[Mohammed],
Liu, W.[Wei],
Shah, S.A.A.[Syed Afaq Ali],
SubICap: Towards Subword-informed Image Captioning,
WACV21(3539-3540)
IEEE DOI
2106
Measurement, Training, Vocabulary, Image segmentation,
Image color analysis, Computational modeling, Semantics
BibRef
Umemura, K.[Kazuki],
Kastner, M.A.[Marc A.],
Ide, I.[Ichiro],
Kawanishi, Y.[Yasutomo],
Hirayama, T.[Takatsugu],
Doman, K.[Keisuke],
Deguchi, D.[Daisuke],
Murase, H.[Hiroshi],
Tell as You Imagine: Sentence Imageability-aware Image Captioning,
MMMod21(II:62-73).
Springer DOI
2106
BibRef
Hallonquist, N.[Neil],
Geman, D.[Donald],
Younes, L.[Laurent],
Graph Discovery for Visual Test Generation,
ICPR21(7500-7507)
IEEE DOI
2105
Visualization, Vocabulary, Machine vision, Semantics,
Image representation, Knowledge discovery, Probability distribution
BibRef
Li, X.J.[Xin-Jie],
Yang, C.[Chun],
Chen, S.L.[Song-Lu],
Zhu, C.[Chao],
Yin, X.C.[Xu-Cheng],
Semantic Bilinear Pooling for Fine-Grained Recognition,
ICPR21(3660-3666)
IEEE DOI
2105
Training, Deep learning, Semantics, Birds,
Testing, Semantic Information, Bilinear Pooling, Fine-Grained Recognition
BibRef
Chavhan, R.[Ruchika],
Banerjee, B.[Biplab],
Zhu, X.X.[Xiao Xiang],
Chaudhuri, S.[Subhasis],
A Novel Actor Dual-Critic Model for Remote Sensing Image Captioning,
ICPR21(4918-4925)
IEEE DOI
2105
Training, Image coding, Reinforcement learning, Gain measurement,
Benchmark testing, Optical imaging, Data models
BibRef
Kalimuthu, M.[Marimuthu],
Mogadala, A.[Aditya],
Mosbach, M.[Marius],
Klakow, D.[Dietrich],
Fusion Models for Improved Image Captioning,
MMDLCA20(381-395).
Springer DOI
2103
BibRef
Cetinic, E.[Eva],
Iconographic Image Captioning for Artworks,
FAPER20(502-516).
Springer DOI
2103
BibRef
Huang, Y.Q.[Yi-Qing],
Chen, J.S.[Jian-Sheng],
Show, Conceive and Tell: Image Captioning with Prospective Linguistic
Information,
ACCV20(VI:478-494).
Springer DOI
2103
BibRef
Deng, C.R.[Chao-Rui],
Ding, N.[Ning],
Tan, M.K.[Ming-Kui],
Wu, Q.[Qi],
Length-controllable Image Captioning,
ECCV20(XIII:712-729).
Springer DOI
2011
BibRef
Gurari, D.[Danna],
Zhao, Y.N.[Yi-Nan],
Zhang, M.[Meng],
Bhattacharya, N.[Nilavra],
Captioning Images Taken by People Who Are Blind,
ECCV20(XVII:417-434).
Springer DOI
2011
BibRef
Zhong, Y.W.[Yi-Wu],
Wang, L.W.[Li-Wei],
Chen, J.S.[Jian-Shu],
Yu, D.[Dong],
Li, Y.[Yin],
Comprehensive Image Captioning via Scene Graph Decomposition,
ECCV20(XIV:211-229).
Springer DOI
2011
BibRef
Wang, Z.[Zeyu],
Feng, B.[Berthy],
Narasimhan, K.[Karthik],
Russakovsky, O.[Olga],
Towards Unique and Informative Captioning of Images,
ECCV20(VII:629-644).
Springer DOI
2011
BibRef
Sidorov, O.[Oleksii],
Hu, R.H.[Rong-Hang],
Rohrbach, M.[Marcus],
Singh, A.[Amanpreet],
TextCaps: A Dataset for Image Captioning with Reading Comprehension,
ECCV20(II:742-758).
Springer DOI
2011
BibRef
Durand, T.[Thibaut],
Learning User Representations for Open Vocabulary Image Hashtag
Prediction,
CVPR20(9766-9775)
IEEE DOI
2008
Tagging, Twitter, Computational modeling, Vocabulary,
Predictive models, History, Visualization
BibRef
Prabhudesai, M.[Mihir],
Tung, H.Y.F.[Hsiao-Yu Fish],
Javed, S.A.[Syed Ashar],
Sieb, M.[Maximilian],
Harley, A.W.[Adam W.],
Fragkiadaki, K.[Katerina],
Embodied Language Grounding With 3D Visual Feature Representations,
CVPR20(2217-2226)
IEEE DOI
2008
Associating language utterances to 3D visual abstractions.
Visualization,
Cameras, Feature extraction, Detectors, Solid modeling
BibRef
Li, Z.,
Tran, Q.,
Mai, L.,
Lin, Z.,
Yuille, A.L.,
Context-Aware Group Captioning via Self-Attention and Contrastive
Features,
CVPR20(3437-3447)
IEEE DOI
2008
Task analysis, Visualization, Context modeling,
Training, Natural languages, Computational modeling
BibRef
Zhou, Y.,
Wang, M.,
Liu, D.,
Hu, Z.,
Zhang, H.,
More Grounded Image Captioning by Distilling Image-Text Matching
Model,
CVPR20(4776-4785)
IEEE DOI
2008
Visualization, Grounding, Task analysis, Training, Measurement,
Computational modeling, Image edge detection
BibRef
Sammani, F.,
Melas-Kyriazi, L.,
Show, Edit and Tell: A Framework for Editing Image Captions,
CVPR20(4807-4815)
IEEE DOI
2008
Decoding, Visualization, Task analysis, Logic gates,
Natural languages, Adaptation models, Glass
BibRef
Chen, S.,
Jin, Q.,
Wang, P.,
Wu, Q.,
Say As You Wish: Fine-Grained Control of Image Caption Generation
With Abstract Scene Graphs,
CVPR20(9959-9968)
IEEE DOI
2008
Semantics, Decoding, Visualization, Feature extraction,
Controllability, Task analysis, Measurement
BibRef
Guo, L.,
Liu, J.,
Zhu, X.,
Yao, P.,
Lu, S.,
Lu, H.,
Normalized and Geometry-Aware Self-Attention Network for Image
Captioning,
CVPR20(10324-10333)
IEEE DOI
2008
Geometry, Task analysis, Visualization, Decoding, Training,
Feature extraction, Computer architecture
BibRef
Chen, J.,
Jin, Q.,
Better Captioning With Sequence-Level Exploration,
CVPR20(10887-10896)
IEEE DOI
2008
Task analysis, Measurement, Training, Computational modeling,
Computer architecture, Portable computers, Decoding
BibRef
Pan, Y.,
Yao, T.,
Li, Y.,
Mei, T.,
X-Linear Attention Networks for Image Captioning,
CVPR20(10968-10977)
IEEE DOI
2008
Visualization, Decoding, Cognition, Knowledge discovery,
Task analysis, Aggregates, Weight measurement
BibRef
Park, G.[Geondo],
Han, C.[Chihye],
Kim, D.[Daeshik],
Yoon, W.J.[Won-Jun],
MHSAN: Multi-Head Self-Attention Network for Visual Semantic
Embedding,
WACV20(1507-1515)
IEEE DOI
2006
Feature extraction, Visualization, Semantics, Task analysis,
Recurrent neural networks, Image representation, Image coding
BibRef
Chen, C.,
Zhang, R.,
Koh, E.,
Kim, S.,
Cohen, S.,
Rossi, R.,
Figure Captioning with Relation Maps for Reasoning,
WACV20(1526-1534)
IEEE DOI
2006
Bars, Training, Visualization, Decoding, Computational modeling,
Task analysis, Portable document format
BibRef
He, S.,
Tavakoli, H.R.,
Borji, A.,
Pugeault, N.,
Human Attention in Image Captioning: Dataset and Analysis,
ICCV19(8528-8537)
IEEE DOI
2004
Code, Captioning.
WWW Link. convolutional neural nets, image segmentation,
natural language processing, object detection, visual perception,
Adaptation models
BibRef
Huang, L.,
Wang, W.,
Chen, J.,
Wei, X.,
Attention on Attention for Image Captioning,
ICCV19(4633-4642)
IEEE DOI
2004
Code, Captioning.
WWW Link. decoding, encoding, image processing, natural language processing,
element-wise multiplication, image captioning, weighted average,
Testing
BibRef
Yao, T.,
Pan, Y.,
Li, Y.,
Mei, T.,
Hierarchy Parsing for Image Captioning,
ICCV19(2621-2629)
IEEE DOI
2004
convolutional neural nets, feature extraction, image coding,
image representation, image segmentation
BibRef
Liu, L.,
Tang, J.,
Wan, X.,
Guo, Z.,
Generating Diverse and Descriptive Image Captions Using Visual
Paraphrases,
ICCV19(4239-4248)
IEEE DOI
2004
image classification,
learning (artificial intelligence), Machine learning
BibRef
Ke, L.,
Pei, W.,
Li, R.,
Shen, X.,
Tai, Y.,
Reflective Decoding Network for Image Captioning,
ICCV19(8887-8896)
IEEE DOI
2004
decoding, encoding, feature extraction,
learning (artificial intelligence), Random access memory
BibRef
Vered, G.,
Oren, G.,
Atzmon, Y.,
Chechik, G.,
Joint Optimization for Cooperative Image Captioning,
ICCV19(8897-8906)
IEEE DOI
2004
gradient methods, image sampling, natural language processing,
stochastic programming, text analysis, Loss measurement
BibRef
Ge, H.,
Yan, Z.,
Zhang, K.,
Zhao, M.,
Sun, L.,
Exploring Overall Contextual Information for Image Captioning in
Human-Like Cognitive Style,
ICCV19(1754-1763)
IEEE DOI
2004
cognition, computational linguistics,
learning (artificial intelligence), Cognition
BibRef
Agrawal, H.,
Desai, K.,
Wang, Y.,
Chen, X.,
Jain, R.,
Johnson, M.,
Batra, D.,
Parikh, D.,
Lee, S.,
Anderson, P.,
nocaps: novel object captioning at scale,
ICCV19(8947-8956)
IEEE DOI
2004
feature extraction,
learning (artificial intelligence), object detection, Vegetation
BibRef
Nguyen, A.,
Tran, Q.D.,
Do, T.,
Reid, I.,
Caldwell, D.G.,
Tsagarakis, N.G.,
Object Captioning and Retrieval with Natural Language,
ACVR19(2584-2592)
IEEE DOI
2004
convolutional neural nets, image retrieval,
learning (artificial intelligence), vision and language
BibRef
Gu, J.,
Joty, S.,
Cai, J.,
Zhao, H.,
Yang, X.,
Wang, G.,
Unpaired Image Captioning via Scene Graph Alignments,
ICCV19(10322-10331)
IEEE DOI
2004
graph theory, image representation, image retrieval,
natural language processing, text analysis, Encoding
BibRef
Shen, T.,
Kar, A.,
Fidler, S.,
Learning to Caption Images Through a Lifetime by Asking Questions,
ICCV19(10392-10401)
IEEE DOI
2004
image retrieval, multi-agent systems,
natural language processing, Automobiles
BibRef
Aneja, J.[Jyoti],
Agrawal, H.[Harsh],
Batra, D.[Dhruv],
Schwing, A.G.[Alexander G.],
Sequential Latent Spaces for Modeling the Intention During Diverse
Image Captioning,
ICCV19(4260-4269)
IEEE DOI
2004
image retrieval, image segmentation,
learning (artificial intelligence), recurrent neural nets, Controllability
BibRef
Deshpande, A.[Aditya],
Aneja, J.[Jyoti],
Wang, L.W.[Li-Wei],
Schwing, A.G.[Alexander G.],
Forsyth, D.A.[David A.],
Fast, Diverse and Accurate Image Captioning Guided by Part-Of-Speech,
CVPR19(10687-10696).
IEEE DOI
2002
BibRef
Wei, H.Y.[Hai-Yang],
Li, Z.X.[Zhi-Xin],
Zhang, C.L.[Can-Long],
Image Captioning Based on Visual and Semantic Attention,
MMMod20(I:151-162).
Springer DOI
2003
BibRef
Dognin, P.[Pierre],
Melnyk, I.[Igor],
Mroueh, Y.[Youssef],
Ross, J.[Jerret],
Sercu, T.[Tom],
Adversarial Semantic Alignment for Improved Image Captions,
CVPR19(10455-10463).
IEEE DOI
2002
BibRef
Fukui, H.[Hiroshi],
Hirakawa, T.[Tsubasa],
Yamashita, T.[Takayoshi],
Fujiyoshi, H.[Hironobu],
Attention Branch Network: Learning of Attention Mechanism for Visual
Explanation,
CVPR19(10697-10706).
IEEE DOI
2002
BibRef
Biten, A.F.[Ali Furkan],
Gomez, L.[Lluis],
Rusinol, M.[Marcal],
Karatzas, D.[Dimosthenis],
Good News, Everyone! Context Driven Entity-Aware Captioning for News
Images,
CVPR19(12458-12467).
IEEE DOI
2002
BibRef
Surís, D.[Dídac],
Epstein, D.[Dave],
Ji, H.[Heng],
Chang, S.F.[Shih-Fu],
Vondrick, C.[Carl],
Learning to Learn Words from Visual Scenes,
ECCV20(XXIX: 434-452).
Springer DOI
2010
BibRef
Shuster, K.[Kurt],
Humeau, S.[Samuel],
Hu, H.[Hexiang],
Bordes, A.[Antoine],
Weston, J.[Jason],
Engaging Image Captioning via Personality,
CVPR19(12508-12518).
IEEE DOI
2002
BibRef
Feng, Y.[Yang],
Ma, L.[Lin],
Liu, W.[Wei],
Luo, J.B.[Jie-Bo],
Unsupervised Image Captioning,
CVPR19(4120-4129).
IEEE DOI
2002
BibRef
Xu, Y.[Yan],
Wu, B.Y.[Bao-Yuan],
Shen, F.M.[Fu-Min],
Fan, Y.B.[Yan-Bo],
Zhang, Y.[Yong],
Shen, H.T.[Heng Tao],
Liu, W.[Wei],
Exact Adversarial Attack to Image Captioning via Structured Output
Learning With Latent Variables,
CVPR19(4130-4139).
IEEE DOI
2002
BibRef
Wang, Q.Z.[Qing-Zhong],
Chan, A.B.[Antoni B.],
Describing Like Humans: On Diversity in Image Captioning,
CVPR19(4190-4198).
IEEE DOI
2002
BibRef
Guo, L.T.[Long-Teng],
Liu, J.[Jing],
Yao, P.[Peng],
Li, J.W.[Jiang-Wei],
Lu, H.Q.[Han-Qing],
MSCap: Multi-Style Image Captioning With Unpaired Stylized Text,
CVPR19(4199-4208).
IEEE DOI
2002
BibRef
Zhang, L.[Lu],
Zhang, J.M.[Jian-Ming],
Lin, Z.[Zhe],
Lu, H.C.[Hu-Chuan],
He, Y.[You],
CapSal: Leveraging Captioning to Boost Semantics for Salient Object
Detection,
CVPR19(6017-6026).
IEEE DOI
2002
BibRef
Yin, G.J.[Guo-Jun],
Sheng, L.[Lu],
Liu, B.[Bin],
Yu, N.H.[Neng-Hai],
Wang, X.G.[Xiao-Gang],
Shao, J.[Jing],
Context and Attribute Grounded Dense Captioning,
CVPR19(6234-6243).
IEEE DOI
2002
BibRef
Gao, J.L.[Jun-Long],
Wang, S.Q.[Shi-Qi],
Wang, S.S.[Shan-She],
Ma, S.W.[Si-Wei],
Gao, W.[Wen],
Self-Critical N-Step Training for Image Captioning,
CVPR19(6293-6301).
IEEE DOI
2002
BibRef
Qin, Y.[Yu],
Du, J.J.[Jia-Jun],
Zhang, Y.H.[Yong-Hua],
Lu, H.T.[Hong-Tao],
Look Back and Predict Forward in Image Captioning,
CVPR19(8359-8367).
IEEE DOI
2002
BibRef
Zheng, Y.[Yue],
Li, Y.[Yali],
Wang, S.J.[Sheng-Jin],
Intention Oriented Image Captions With Guiding Objects,
CVPR19(8387-8396).
IEEE DOI
2002
BibRef
Huang, Y.,
Li, C.,
Li, T.,
Wan, W.,
Chen, J.,
Image Captioning with Attribute Refinement,
ICIP19(1820-1824)
IEEE DOI
1910
Image captioning, attribute recognition, Semantic attention,
Deep Neural Network, Conditional Random Field
BibRef
Lee, J.,
Lee, Y.,
Seong, S.,
Kim, K.,
Kim, S.,
Kim, J.,
Capturing Long-Range Dependencies in Video Captioning,
ICIP19(1880-1884)
IEEE DOI
1910
Video captioning, non-local block, long short-term memory,
long-range dependency, video representation
BibRef
Shi, J.,
Li, Y.,
Wang, S.,
Cascade Attention: Multiple Feature Based Learning for Image
Captioning,
ICIP19(1970-1974)
IEEE DOI
1910
Image Captioning, Attention Mechanism, Cascade Attention
BibRef
Wang, Y.,
Shen, Y.,
Xiong, H.,
Lin, W.,
Adaptive Hard Example Mining for Image Captioning,
ICIP19(3342-3346)
IEEE DOI
1910
Reinforcement Learning, Image Captioning
BibRef
Xiao, H.,
Shi, J.,
A Novel Attribute Selection Mechanism for Video Captioning,
ICIP19(619-623)
IEEE DOI
1910
Attributes, Video captioning, Attention, Reinforcement learning
BibRef
Lim, J.H.,
Chan, C.S.,
Mask Captioning Network,
ICIP19(1-5)
IEEE DOI
1910
Image captioning, Deep learning, Scene understanding
BibRef
Wang, Q.Z.[Qing-Zhong],
Chan, A.B.[Antoni B.],
Gated Hierarchical Attention for Image Captioning,
ACCV18(IV:21-37).
Springer DOI
1906
BibRef
Wang, W.X.[Wei-Xuan],
Chen, Z.H.[Zhi-Hong],
Hu, H.F.[Hai-Feng],
Multivariate Attention Network for Image Captioning,
ACCV18(VI:587-602).
Springer DOI
1906
BibRef
Ghanimifard, M.[Mehdi],
Dobnik, S.[Simon],
Knowing When to Look for What and Where: Evaluating Generation of
Spatial Descriptions with Adaptive Attention,
VL18(IV:153-161).
Springer DOI
1905
See also Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning.
BibRef
Kim, B.[Boeun],
Lee, Y.H.[Young Han],
Jung, H.[Hyedong],
Cho, C.[Choongsang],
Distinctive-Attribute Extraction for Image Captioning,
VL18(IV:133-144).
Springer DOI
1905
BibRef
Tanti, M.[Marc],
Gatt, A.[Albert],
Muscat, A.[Adrian],
Pre-gen Metrics: Predicting Caption Quality Metrics Without Generating
Captions,
VL18(IV:114-123).
Springer DOI
1905
BibRef
Tanti, M.[Marc],
Gatt, A.[Albert],
Camilleri, K.P.[Kenneth P.],
Quantifying the Amount of Visual Information Used by Neural Caption
Generators,
VL18(IV:124-132).
Springer DOI
1905
BibRef
Ren, L.,
Qi, G.,
Hua, K.,
Improving Diversity of Image Captioning Through Variational
Autoencoders and Adversarial Learning,
WACV19(263-272)
IEEE DOI
1904
image classification, image coding,
image segmentation, learning (artificial intelligence),
Maximum likelihood estimation
BibRef
Zhou, Y.,
Sun, Y.,
Honavar, V.,
Improving Image Captioning by Leveraging Knowledge Graphs,
WACV19(283-293)
IEEE DOI
1904
graph theory, image capture, image retrieval,
performance measure, image captioning systems, knowledge graphs,
Generators
BibRef
Lu, J.S.[Jia-Sen],
Yang, J.W.[Jian-Wei],
Batra, D.[Dhruv],
Parikh, D.[Devi],
Neural Baby Talk,
CVPR18(7219-7228)
IEEE DOI
1812
Detectors, Visualization, Grounding, Pediatrics, Natural languages,
Dogs, Task analysis
BibRef
Khademi, M.,
Schulte, O.,
Image Caption Generation with Hierarchical Contextual Visual Spatial
Attention,
Cognitive18(2024-20248)
IEEE DOI
1812
Feature extraction, Visualization, Logic gates,
Computer architecture, Task analysis, Context modeling, Computational modeling
BibRef
Yan, S.,
Wu, F.,
Smith, J.S.,
Lu, W.,
Zhang, B.,
Image Captioning using Adversarial Networks and Reinforcement
Learning,
ICPR18(248-253)
IEEE DOI
1812
Generators, Generative adversarial networks,
Monte Carlo methods, Maximum likelihood estimation,
Task analysis
BibRef
Wang, F.,
Gong, X.,
Huang, L.,
Time-Dependent Pre-attention Model for Image Captioning,
ICPR18(3297-3302)
IEEE DOI
1812
Decoding, Task analysis, Semantics, Visualization,
Feature extraction, Computational modeling, Computer science
BibRef
Luo, R.,
Shakhnarovich, G.,
Cohen, S.,
Price, B.,
Discriminability Objective for Training Descriptive Captions,
CVPR18(6964-6974)
IEEE DOI
1812
Training, Task analysis, Visualization, Measurement,
Computational modeling, Generators, Airplanes
BibRef
Cui, Y.,
Yang, G.,
Veit, A.,
Huang, X.,
Belongie, S.,
Learning to Evaluate Image Captioning,
CVPR18(5804-5812)
IEEE DOI
1812
Measurement, Pathology, Training, Correlation, SPICE, Robustness, Task analysis
BibRef
Aneja, J.,
Deshpande, A.,
Schwing, A.G.,
Convolutional Image Captioning,
CVPR18(5561-5570)
IEEE DOI
1812
Training, Computer architecture, Task analysis,
Hidden Markov models, Microprocessors, Computational modeling, Indexing
BibRef
Chen, F.,
Ji, R.,
Sun, X.,
Wu, Y.,
Su, J.,
GroupCap: Group-Based Image Captioning with Structured Relevance and
Diversity Constraints,
CVPR18(1345-1353)
IEEE DOI
1812
Visualization, Correlation, Semantics, Feature extraction, Training,
Adaptation models, Task analysis
BibRef
Chen, X.,
Ma, L.,
Jiang, W.,
Yao, J.,
Liu, W.,
Regularizing RNNs for Caption Generation by Reconstructing the Past
with the Present,
CVPR18(7995-8003)
IEEE DOI
1812
Pattern recognition
BibRef
Yao, T.[Ting],
Pan, Y.W.[Ying-Wei],
Li, Y.[Yehao],
Mei, T.[Tao],
Exploring Visual Relationship for Image Captioning,
ECCV18(XIV: 711-727).
Springer DOI
1810
BibRef
Shah, S.A.A.[Syed Afaq Ali],
NNEval: Neural Network Based Evaluation Metric for Image Captioning,
ECCV18(VIII: 39-55).
Springer DOI
1810
BibRef
Jiang, W.H.[Wen-Hao],
Ma, L.[Lin],
Jiang, Y.G.[Yu-Gang],
Liu, W.[Wei],
Zhang, T.[Tong],
Recurrent Fusion Network for Image Captioning,
ECCV18(II: 510-526).
Springer DOI
1810
BibRef
Chatterjee, M.[Moitreya],
Schwing, A.G.[Alexander G.],
Diverse and Coherent Paragraph Generation from Images,
ECCV18(II: 747-763).
Springer DOI
1810
BibRef
Chen, S.[Shi],
Zhao, Q.[Qi],
Boosted Attention: Leveraging Human Attention for Image Captioning,
ECCV18(XI: 72-88).
Springer DOI
1810
BibRef
Dai, B.[Bo],
Ye, D.[Deming],
Lin, D.[Dahua],
Rethinking the Form of Latent States in Image Captioning,
ECCV18(VI: 294-310).
Springer DOI
1810
BibRef
Liu, X.H.[Xi-Hui],
Li, H.S.[Hong-Sheng],
Shao, J.[Jing],
Chen, D.P.[Da-Peng],
Wang, X.G.[Xiao-Gang],
Show, Tell and Discriminate:
Image Captioning by Self-retrieval with Partially Labeled Data,
ECCV18(XV: 353-369).
Springer DOI
1810
BibRef
Fang, F.,
Wang, H.,
Tang, P.,
Image Captioning with Word Level Attention,
ICIP18(1278-1282)
IEEE DOI
1809
Visualization, Feature extraction, Task analysis, Training,
Recurrent neural networks, Semantics, Computational modeling,
bidirectional spatial embedding
BibRef
Zhu, Z.,
Xue, Z.,
Yuan, Z.,
Topic-Guided Attention for Image Captioning,
ICIP18(2615-2619)
IEEE DOI
1809
Visualization, Semantics, Feature extraction, Training, Decoding,
Generators, Measurement, Image captioning, Attention, Topic, Attribute,
Deep Neural Network
BibRef
Gomez-Garay, A.[Alejandro],
Raducanu, B.[Bogdan],
Salas, J.[Joaquín],
Dense Captioning of Natural Scenes in Spanish,
MCPR18(145-154).
Springer DOI
1807
BibRef
Yao, L.[Li],
Ballas, N.[Nicolas],
Cho, K.[Kyunghyun],
Smith, J.[John],
Bengio, Y.[Yoshua],
Oracle Performance for Visual Captioning,
BMVC16(xx-yy).
HTML Version.
1805
BibRef
Dong, H.[Hao],
Zhang, J.Q.[Jing-Qing],
McIlwraith, D.[Douglas],
Guo, Y.[Yike],
I2T2I: Learning text to image synthesis with textual data
augmentation,
ICIP17(2015-2019)
IEEE DOI
1803
Birds, Generators, Image generation,
Recurrent neural networks, Shape, Training, Deep learning, GAN, Image Synthesis
BibRef
Jia, Y.H.[Yu-Hua],
Bai, L.[Liang],
Wang, P.[Peng],
Guo, J.L.[Jin-Lin],
Xie, Y.X.[Yu-Xiang],
Deep Convolutional Neural Network for Correlating Images and Sentences,
MMMod18(I:154-165).
Springer DOI
1802
BibRef
Liu, J.Y.[Jing-Yu],
Wang, L.[Liang],
Yang, M.H.[Ming-Hsuan],
Referring Expression Generation and Comprehension via Attributes,
ICCV17(4866-4874)
IEEE DOI
1802
Language Descriptions for objects.
learning (artificial intelligence), object detection, RefCOCO,
RefCOCO+, RefCOCOg, attribute learning model, common space model,
Visualization
BibRef
Dai, B.,
Fidler, S.,
Urtasun, R.,
Lin, D.,
Towards Diverse and Natural Image Descriptions via a Conditional GAN,
ICCV17(2989-2998)
IEEE DOI
1802
image retrieval, image sequences, inference mechanisms,
learning (artificial intelligence),
Visualization
BibRef
Liang, X.,
Hu, Z.,
Zhang, H.,
Gan, C.,
Xing, E.P.,
Recurrent Topic-Transition GAN for Visual Paragraph Generation,
ICCV17(3382-3391)
IEEE DOI
1802
document image processing, inference mechanisms, natural scenes,
recurrent neural nets, text analysis, RTT-GAN,
Visualization
BibRef
Shetty, R.,
Rohrbach, M.,
Hendricks, L.A.,
Fritz, M.,
Schiele, B.,
Speaking the Same Language:
Matching Machine to Human Captions by Adversarial Training,
ICCV17(4155-4164)
IEEE DOI
1802
image matching, learning (artificial intelligence),
sampling methods, vocabulary, adversarial training,
Visualization
BibRef
Liu, S.,
Zhu, Z.,
Ye, N.,
Guadarrama, S.,
Murphy, K.,
Improved Image Captioning via Policy Gradient optimization of SPIDEr,
ICCV17(873-881)
IEEE DOI
1802
Maximum likelihood estimation, Measurement, Mixers, Robustness,
SPICE, Training
BibRef
Gu, J.X.[Jiu-Xiang],
Joty, S.[Shafiq],
Cai, J.F.[Jian-Fei],
Wang, G.[Gang],
Unpaired Image Captioning by Language Pivoting,
ECCV18(I: 519-535).
Springer DOI
1810
BibRef
Gu, J.X.[Jiu-Xiang],
Wang, G.[Gang],
Cai, J.F.[Jian-Fei],
Chen, T.H.[Tsu-Han],
An Empirical Study of Language CNN for Image Captioning,
ICCV17(1231-1240)
IEEE DOI
1802
convolution, learning (artificial intelligence),
natural language processing, recurrent neural nets,
Recurrent neural networks
BibRef
Pedersoli, M.,
Lucas, T.,
Schmid, C.,
Verbeek, J.,
Areas of Attention for Image Captioning,
ICCV17(1251-1259)
IEEE DOI
1802
image segmentation, inference mechanisms,
natural language processing, object detection,
Visualization
BibRef
Zhang, Z.,
Wu, J.J.,
Li, Q.,
Huang, Z.,
Traer, J.,
McDermott, J.H.,
Tenenbaum, J.B.,
Freeman, W.T.,
Generative Modeling of Audible Shapes for Object Perception,
ICCV17(1260-1269)
IEEE DOI
1802
audio recording, audio signal processing, audio-visual systems,
feature extraction, inference mechanisms, interactive systems,
Visualization
BibRef
Liu, Z.J.[Zhi-Jian],
Freeman, W.T.[William T.],
Tenenbaum, J.B.[Joshua B.],
Wu, J.J.[Jia-Jun],
Physical Primitive Decomposition,
ECCV18(XII: 3-20).
Springer DOI
1810
BibRef
Wu, J.J.[Jia-Jun],
Lim, J.[Joseph],
Zhang, H.Y.[Hong-Yi],
Tenenbaum, J.B.[Joshua B.],
Freeman, W.T.[William T.],
Physics 101: Learning Physical Object Properties from Unlabeled Videos,
BMVC16(xx-yy).
HTML Version.
1805
BibRef
Tavakoli, H.R.,
Shetty, R.,
Borji, A.,
Laaksonen, J.,
Paying Attention to Descriptions Generated by Image Captioning Models,
ICCV17(2506-2515)
IEEE DOI
1802
feature extraction, image processing, human descriptions,
human-written descriptions, image captioning model,
Visualization
BibRef
Krause, J.[Jonathan],
Johnson, J.[Justin],
Krishna, R.[Ranjay],
Fei-Fei, L.[Li],
A Hierarchical Approach for Generating Descriptive Image Paragraphs,
CVPR17(3337-3345)
IEEE DOI
1711
Feature extraction, Natural languages, Pragmatics,
Recurrent neural networks, Speech, Visualization
BibRef
Vedantam, R.,
Bengio, S.,
Murphy, K.,
Parikh, D.,
Chechik, G.,
Context-Aware Captions from Context-Agnostic Supervision,
CVPR17(1070-1079)
IEEE DOI
1711
Birds, Cats, Cognition, Context modeling, Pragmatics, Training
BibRef
Gan, Z.,
Gan, C.,
He, X.,
Pu, Y.,
Tran, K.,
Gao, J.,
Carin, L.,
Deng, L.,
Semantic Compositional Networks for Visual Captioning,
CVPR17(1141-1150)
IEEE DOI
1711
Feature extraction, Mouth, Pediatrics, Semantics, Tensile stress,
Training, Visualization
BibRef
Ren, Z.,
Wang, X.,
Zhang, N.,
Lv, X.,
Li, L.J.,
Deep Reinforcement Learning-Based Image Captioning with Embedding
Reward,
CVPR17(1151-1159)
IEEE DOI
1711
Decision making, Learning (artificial intelligence), Measurement,
Neural networks, Training, Visualization
BibRef
Rennie, S.J.,
Marcheret, E.,
Mroueh, Y.,
Ross, J.,
Goel, V.,
Self-Critical Sequence Training for Image Captioning,
CVPR17(1179-1195)
IEEE DOI
1711
Inference algorithms, Learning (artificial intelligence),
Logic gates, Measurement, Predictive models, Training
BibRef
Yang, L.,
Tang, K.,
Yang, J.,
Li, L.J.,
Dense Captioning with Joint Inference and Visual Context,
CVPR17(1978-1987)
IEEE DOI
1711
Bioinformatics, Genomics, Object detection, Proposals, Semantics,
Training, Visualization
BibRef
Lu, J.,
Xiong, C.,
Parikh, D.,
Socher, R.,
Knowing When to Look: Adaptive Attention via a Visual Sentinel for
Image Captioning,
CVPR17(3242-3250)
IEEE DOI
1711
Adaptation models, Computational modeling, Context modeling,
Decoding, Logic gates, Mathematical model, Visualization
BibRef
Yao, T.,
Pan, Y.,
Li, Y.,
Mei, T.,
Incorporating Copying Mechanism in Image Captioning for Learning
Novel Objects,
CVPR17(5263-5271)
IEEE DOI
1711
Decoding, Hidden Markov models, Object recognition,
Recurrent neural networks, Standards, Training, Visualization
BibRef
Chen, L.,
Zhang, H.,
Xiao, J.,
Nie, L.,
Shao, J.,
Liu, W.,
Chua, T.S.,
SCA-CNN: Spatial and Channel-Wise Attention in Convolutional Networks
for Image Captioning,
CVPR17(6298-6306)
IEEE DOI
1711
Detectors, Feature extraction, Image coding, Neural networks,
Semantics, Visualization
BibRef
Sun, Q.,
Lee, S.,
Batra, D.,
Bidirectional Beam Search: Forward-Backward Inference in Neural
Sequence Models for Fill-in-the-Blank Image Captioning,
CVPR17(7215-7223)
IEEE DOI
1711
Approximation algorithms, Computational modeling, Decoding,
History, Inference algorithms, Recurrent, neural, networks
BibRef
Wang, Y.,
Lin, Z.,
Shen, X.,
Cohen, S.,
Cottrell, G.W.,
Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition,
CVPR17(7378-7387)
IEEE DOI
1711
Measurement, Recurrent neural networks, SPICE, Semantics, Skeleton, Training
BibRef
Zanfir, M.[Mihai],
Marinoiu, E.[Elisabeta],
Sminchisescu, C.[Cristian],
Spatio-Temporal Attention Models for Grounded Video Captioning,
ACCV16(IV: 104-119).
Springer DOI
1704
BibRef
Chen, T.H.[Tseng-Hung],
Zeng, K.H.[Kuo-Hao],
Hsu, W.T.[Wan-Ting],
Sun, M.[Min],
Video Captioning via Sentence Augmentation and Spatio-Temporal
Attention,
Assist16(I: 269-286).
Springer DOI
1704
BibRef
Weiland, L.[Lydia],
Hulpus, I.[Ioana],
Ponzetto, S.P.[Simone Paolo],
Dietz, L.[Laura],
Using Object Detection, NLP, and Knowledge Bases to Understand the
Message of Images,
MMMod17(II: 405-418).
Springer DOI
1701
BibRef
Liu, Y.[Yu],
Guo, Y.M.[Yan-Ming],
Lew, M.S.[Michael S.],
What Convnets Make for Image Captioning?,
MMMod17(I: 416-428).
Springer DOI
1701
BibRef
Tran, K.,
He, X.,
Zhang, L.,
Sun, J.,
Rich Image Captioning in the Wild,
DeepLearn-C16(434-441)
IEEE DOI
1612
BibRef
Wang, Y.L.[Yi-Lin],
Wang, S.H.[Su-Hang],
Tang, J.L.[Ji-Liang],
Liu, H.[Huan],
Li, B.X.[Bao-Xin],
PPP: Joint Pointwise and Pairwise Image Label Prediction,
CVPR16(6005-6013)
IEEE DOI
1612
BibRef
Yatskar, M.[Mark],
Ordonez, V.,
Zettlemoyer, L.[Luke],
Farhadi, A.[Ali],
Commonly Uncommon: Semantic Sparsity in Situation Recognition,
CVPR17(6335-6344)
IEEE DOI
1711
BibRef
Earlier: A1, A3, A4, Only:
Situation Recognition: Visual Semantic Role Labeling for Image
Understanding,
CVPR16(5534-5542)
IEEE DOI
1612
Image recognition, Image representation, Predictive models,
Semantics, Tensile stress, Training
BibRef
Sadhu, A.[Arka],
Gupta, T.[Tanmay],
Yatskar, M.[Mark],
Nevatia, R.[Ram],
Kembhavi, A.[Aniruddha],
Visual Semantic Role Labeling for Video Understanding,
CVPR21(5585-5596)
IEEE DOI
2111
Visualization, Annotations, Semantics,
Benchmark testing, Motion pictures
BibRef
Kottur, S.[Satwik],
Vedantam, R.[Ramakrishna],
Moura, J.M.F.[José M. F.],
Parikh, D.[Devi],
VisualWord2Vec (Vis-W2V):
Learning Visually Grounded Word Embeddings Using Abstract Scenes,
CVPR16(4985-4994)
IEEE DOI
1612
BibRef
Zhu, Y.,
Groth, O.,
Bernstein, M.,
Fei-Fei, L.,
Visual7W: Grounded Question Answering in Images,
CVPR16(4995-5004)
IEEE DOI
1612
BibRef
Zhang, P.,
Goyal, Y.,
Summers-Stay, D.,
Batra, D.,
Parikh, D.,
Yin and Yang: Balancing and Answering Binary Visual Questions,
CVPR16(5014-5022)
IEEE DOI
1612
BibRef
Park, D.H.,
Darrell, T.J.,
Rohrbach, A.,
Robust Change Captioning,
ICCV19(4623-4632)
IEEE DOI
2004
feature extraction, learning (artificial intelligence),
natural language processing, object-oriented programming, Predictive models
BibRef
Venugopalan, S.[Subhashini],
Hendricks, L.A.[Lisa Anne],
Rohrbach, M.[Marcus],
Mooney, R.[Raymond],
Darrell, T.J.[Trevor J.],
Saenko, K.[Kate],
Captioning Images with Diverse Objects,
CVPR17(1170-1178)
IEEE DOI
1711
BibRef
Earlier: A2, A1, A3, A4, A6, A5:
Deep Compositional Captioning: Describing Novel Object Categories
without Paired Training Data,
CVPR16(1-10)
IEEE DOI
1612
Data models, Image recognition, Predictive models, Semantics,
Training, Visualization.
Novel objects not in training data.
BibRef
Johnson, J.[Justin],
Karpathy, A.[Andrej],
Fei-Fei, L.[Li],
DenseCap:
Fully Convolutional Localization Networks for Dense Captioning,
CVPR16(4565-4574)
IEEE DOI
1612
Both localize and describe salient regions in images in natural language.
BibRef
Lin, X.[Xiao],
Parikh, D.[Devi],
Leveraging Visual Question Answering for Image-Caption Ranking,
ECCV16(II: 261-277).
Springer DOI
1611
BibRef
Earlier:
Don't just listen, use your imagination:
Leveraging visual common sense for non-visual tasks,
CVPR15(2984-2993)
IEEE DOI
1510
BibRef
Chen, T.L.[Tian-Lang],
Zhang, Z.P.[Zhong-Ping],
You, Q.Z.[Quan-Zeng],
Fang, C.[Chen],
Wang, Z.W.[Zhao-Wen],
Jin, H.L.[Hai-Lin],
Luo, J.B.[Jie-Bo],
'Factual' or 'Emotional':
Stylized Image Captioning with Adaptive Learning and Attention,
ECCV18(X: 527-543).
Springer DOI
1810
BibRef
You, Q.Z.[Quan-Zeng],
Jin, H.L.[Hai-Lin],
Wang, Z.W.[Zhao-Wen],
Fang, C.[Chen],
Luo, J.B.[Jie-Bo],
Image Captioning with Semantic Attention,
CVPR16(4651-4659)
IEEE DOI
1612
BibRef
Jia, X.[Xu],
Gavves, E.[Efstratios],
Fernando, B.[Basura],
Tuytelaars, T.[Tinne],
Guiding the Long-Short Term Memory Model for Image Caption Generation,
ICCV15(2407-2415)
IEEE DOI
1602
Computer architecture
BibRef
Chen, X.L.[Xin-Lei],
Zitnick, C.L.[C. Lawrence],
Mind's eye:
A recurrent visual representation for image caption generation,
CVPR15(2422-2431)
IEEE DOI
1510
BibRef
Vedantam, R.[Ramakrishna],
Zitnick, C.L.[C. Lawrence],
Parikh, D.[Devi],
CIDEr: Consensus-based image description evaluation,
CVPR15(4566-4575)
IEEE DOI
1510
BibRef
Fang, H.[Hao],
Gupta, S.[Saurabh],
Iandola, F.[Forrest],
Srivastava, R.K.[Rupesh K.],
Deng, L.[Li],
Dollar, P.[Piotr],
Gao, J.F.[Jian-Feng],
He, X.D.[Xiao-Dong],
Mitchell, M.[Margaret],
Platt, J.C.[John C.],
Zitnick, C.L.[C. Lawrence],
Zweig, G.[Geoffrey],
From captions to visual concepts and back,
CVPR15(1473-1482)
IEEE DOI
1510
BibRef
Ramnath, K.[Krishnan],
Baker, S.[Simon],
Vanderwende, L.[Lucy],
El-Saban, M.[Motaz],
Sinha, S.N.[Sudipta N.],
Kannan, A.[Anitha],
Hassan, N.[Noran],
Galley, M.[Michel],
Yang, Y.[Yi],
Ramanan, D.[Deva],
Bergamo, A.[Alessandro],
Torresani, L.[Lorenzo],
AutoCaption: Automatic caption generation for personal photos,
WACV14(1050-1057)
IEEE DOI
1406
Clouds
BibRef
Chapter on Matching and Recognition Using Volumes, High Level Vision Techniques, Invariants continues in
Image Annotation.