Feng, Y.S.[Yan-Song],
Lapata, M.,
Automatic Caption Generation for News Images,
PAMI(35), No. 4, April 2013, pp. 797-812.
IEEE DOI
1303
Use existing captions and tags, expand to similar images.
BibRef
Nakayama, H.[Hideki],
Harada, T.[Tatsuya],
Kuniyoshi, Y.[Yasuo],
Dense Sampling Low-Level Statistics of Local Features,
IEICE(E93-D), No. 7, July 2010, pp. 1727-1736.
WWW Link.
1008
BibRef
Earlier:
CIVR09(Article No 17).
DOI Link
0907
BibRef
And:
Global Gaussian approach for scene categorization using information
geometry,
CVPR10(2336-2343).
IEEE DOI
1006
BibRef
Earlier:
AI Goggles: Real-time Description and Retrieval in the Real World with
Online Learning,
CRV09(184-191).
IEEE DOI
0905
local features.
Scalability of matching for large-scale indexing.
Boost global features with sampled statistics of local features.
BibRef
Ushiku, Y.[Yoshitaka],
Yamaguchi, M.[Masataka],
Mukuta, Y.[Yusuke],
Harada, T.[Tatsuya],
Common Subspace for Model and Similarity:
Phrase Learning for Caption Generation from Images,
ICCV15(2668-2676)
IEEE DOI
1602
Feature extraction
BibRef
Jin, J.[Jiren],
Nakayama, H.[Hideki],
Annotation order matters:
Recurrent Image Annotator for arbitrary length image tagging,
ICPR16(2452-2457)
IEEE DOI
1705
Correlation, Feature extraction, Indexes, Predictive models,
Recurrent neural networks, Training
BibRef
Harada, T.[Tatsuya],
Nakayama, H.[Hideki],
Kuniyoshi, Y.[Yasuo],
Improving Local Descriptors by Embedding Global and Local Spatial
Information,
ECCV10(IV: 736-749).
Springer DOI
1009
BibRef
Earlier: A2, A1, A3:
Evaluation of dimensionality reduction methods for image
auto-annotation,
BMVC10(xx-yy).
HTML Version.
1009
BibRef
Tariq, A.[Amara],
Foroosh, H.[Hassan],
A Context-Driven Extractive Framework for Generating Realistic Image
Descriptions,
IP(26), No. 2, February 2017, pp. 619-632.
IEEE DOI
1702
image annotation
BibRef
Vinyals, O.[Oriol],
Toshev, A.[Alexander],
Bengio, S.[Samy],
Erhan, D.[Dumitru],
Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning
Challenge,
PAMI(39), No. 4, April 2017, pp. 652-663.
IEEE DOI
1703
BibRef
Earlier:
Show and tell: A neural image caption generator,
CVPR15(3156-3164)
IEEE DOI
1510
Computational modeling
BibRef
Gao, L.L.[Lian-Li],
Guo, Z.[Zhao],
Zhang, H.W.[Han-Wang],
Xu, X.[Xing],
Shen, H.T.[Heng Tao],
Video Captioning With Attention-Based LSTM and Semantic Consistency,
MultMed(19), No. 9, September 2017, pp. 2045-2055.
IEEE DOI
1708
Computational modeling, Correlation, Feature extraction,
Neural networks, Semantics,
Visualization, Attention mechanism, embedding,
long short-term memory (LSTM), video, captioning
BibRef
Hu, M.,
Yang, Y.,
Shen, F.,
Zhang, L.,
Shen, H.T.,
Li, X.,
Robust Web Image Annotation via Exploring Multi-Facet and Structural
Knowledge,
IP(26), No. 10, October 2017, pp. 4871-4884.
IEEE DOI
1708
image annotation, image retrieval, iterative methods,
learning (artificial intelligence), multimedia systems,
optimisation, pattern classification, RMSL,
data structural information,
digital technologies,
image semantic indexing, image semantic retrieval,
robust multiview semi-supervised learning, visual features,
Manifolds, Multimedia communication, Semantics,
Semisupervised learning, Supervised learning, Image annotation,
l2,p-norm, multi-view learning, semi-supervised learning
BibRef
Bin, Y.,
Yang, Y.,
Shen, F.,
Xie, N.,
Shen, H.T.,
Li, X.,
Describing Video With Attention-Based Bidirectional LSTM,
Cyber(49), No. 7, July 2019, pp. 2631-2641.
IEEE DOI
1905
Visualization, Semantics, Decoding, Feature extraction,
Natural languages, Recurrent neural networks, Grammar,
video captioning
BibRef
Wang, J.Y.[Jing-Ya],
Zhu, X.T.[Xia-Tian],
Gong, S.G.[Shao-Gang],
Discovering visual concept structure with sparse and incomplete tags,
AI(250), No. 1, 2017, pp. 16-36.
Elsevier DOI
1708
Automatically discovering the semantic structure of tagged visual data
(e.g. web videos and images).
BibRef
Kilickaya, M.[Mert],
Akkus, B.K.[Burak Kerim],
Cakici, R.[Ruket],
Erdem, A.[Aykut],
Erdem, E.[Erkut],
Ikizler-Cinbis, N.[Nazli],
Data-driven image captioning via salient region discovery,
IET-CV(11), No. 6, September 2017, pp. 398-406.
DOI Link
1709
BibRef
Fu, K.[Kun],
Jin, J.Q.[Jun-Qi],
Cui, R.P.[Run-Peng],
Sha, F.[Fei],
Zhang, C.S.[Chang-Shui],
Aligning Where to See and What to Tell: Image Captioning with
Region-Based Attention and Scene-Specific Contexts,
PAMI(39), No. 12, December 2017, pp. 2321-2334.
IEEE DOI
1711
Adaptation models, Computational modeling, Context modeling,
Data mining, Feature extraction, Image classification,
Visualization, Image captioning, LSTM,
visual attention.
BibRef
Xiao, C.M.[Chang-Ming],
Yang, Q.[Qi],
Xu, X.Q.[Xiao-Qiang],
Zhang, J.W.[Jian-Wei],
Zhou, F.[Feng],
Zhang, C.S.[Chang-Shui],
Where you edit is what you get: Text-guided image editing with
region-based attention,
PR(139), 2023, pp. 109458.
Elsevier DOI
2304
Generative adversarial networks, Text-guided image editing,
Spatial disentanglement
BibRef
Nian, F.D.[Fu-Dong],
Li, T.[Teng],
Wang, Y.[Yan],
Wu, X.Y.[Xin-Yu],
Ni, B.B.[Bing-Bing],
Xu, C.S.[Chang-Sheng],
Learning explicit video attributes from mid-level representation for
video captioning,
CVIU(163), No. 1, 2017, pp. 126-138.
Elsevier DOI
1712
Mid-level video representation
BibRef
He, X.D.[Xiao-Dong],
Deng, L.[Li],
Deep Learning for Image-to-Text Generation: A Technical Overview,
SPMag(34), No. 6, November 2017, pp. 109-116.
IEEE DOI
1712
BibRef
And:
Errata:
SPMag(35), No. 1, January 2018, pp. 178.
IEEE DOI
Artificial intelligence, Image classification,
Natural language processing, Pediatrics, Semantics, Training data,
Visualization
BibRef
Li, L.H.[Ling-Hui],
Tang, S.[Sheng],
Zhang, Y.D.[Yong-Dong],
Deng, L.X.[Li-Xi],
Tian, Q.[Qi],
GLA: Global-Local Attention for Image Description,
MultMed(20), No. 3, March 2018, pp. 726-737.
IEEE DOI
1802
Computational modeling, Decoding, Feature extraction,
Image recognition, Natural language processing,
recurrent neural network
BibRef
Lu, X.,
Wang, B.,
Zheng, X.,
Li, X.,
Exploring Models and Data for Remote Sensing Image Caption Generation,
GeoRS(56), No. 4, April 2018, pp. 2183-2195.
IEEE DOI
1804
Feature extraction, Image representation,
Recurrent neural networks, Remote sensing, Semantics,
semantic understanding
BibRef
Cheng, Q.[Qimin],
Zhang, Q.[Qian],
Fu, P.[Peng],
Tu, C.H.[Cong-Huan],
Li, S.[Sen],
A survey and analysis on automatic image annotation,
PR(79), 2018, pp. 242-259.
Elsevier DOI
1804
Automatic image annotation, Generative model,
Nearest-neighbor model, Discriminative model, Tag-completion, Deep learning
BibRef
Ben Rejeb, I.[Imen],
Ouni, S.[Sonia],
Barhoumi, W.[Walid],
Zagrouba, E.[Ezzeddine],
Fuzzy VA-Files for multi-label image annotation based on visual content
of regions,
SIViP(12), No. 5, July 2018, pp. 877-884.
Springer DOI
1806
Vector Approximation Files.
BibRef
Helmy, T.[Tarek],
A Generic Framework for Semantic Annotation of Images,
IJIG(18), No. 3, July 2018, pp. Article 1850013.
DOI Link
1807
BibRef
Wu, C.L.[Chun-Lei],
Wei, Y.[Yiwei],
Chu, X.L.[Xiao-Liang],
Su, F.[Fei],
Wang, L.[Leiquan],
Modeling visual and word-conditional semantic attention for image
captioning,
SP:IC(67), 2018, pp. 100-107.
Elsevier DOI
1808
Image captioning, Word-conditional semantic attention,
Visual attention, Attention variation
BibRef
Ye, S.,
Han, J.,
Liu, N.,
Attentive Linear Transformation for Image Captioning,
IP(27), No. 11, November 2018, pp. 5514-5524.
IEEE DOI
1809
feature extraction, image classification,
learning (artificial intelligence), matrix algebra, probability,
LSTM
BibRef
Zhang, M.,
Yang, Y.,
Zhang, H.,
Ji, Y.,
Shen, H.T.,
Chua, T.,
More is Better: Precise and Detailed Image Captioning Using Online
Positive Recall and Missing Concepts Mining,
IP(28), No. 1, January 2019, pp. 32-44.
IEEE DOI
1810
data mining, image representation, image retrieval,
image segmentation, learning (artificial intelligence),
element-wise selection
BibRef
Hu, J.[Jiwei],
Lam, K.M.[Kin-Man],
Lou, P.[Ping],
Liu, Q.[Quan],
Deng, W.P.[Wu-Peng],
Can a machine have two systems for recognition, like human beings?,
JVCIR(56), 2018, pp. 275-286.
Elsevier DOI
1811
Image annotation, Multi-labeling, Hierarchical tree structure,
Feature-pool selection
BibRef
Bhagat, P.K.,
Choudhary, P.,
Image annotation: Then and now,
IVC(80), 2018, pp. 1-23.
Elsevier DOI
1812
Image annotation, Automatic image annotation,
Multi-label classification, Image labeling, Image tagging,
Image retrieval
BibRef
Gil-Gonzalez, J.,
Alvarez-Meza, A.,
Orozco-Gutierrez, A.,
Learning from multiple annotators using kernel alignment,
PRL(116), 2018, pp. 150-156.
Elsevier DOI
1812
Multiple annotators, Kernel methods, Classification
BibRef
Bazrafkan, S.[Shabab],
Javidnia, H.[Hossein],
Corcoran, P.[Peter],
Latent space mapping for generation of object elements with
corresponding data annotation,
PRL(116), 2018, pp. 179-186.
Elsevier DOI
1812
Generative models, Latent space mapping, Deep neural networks
BibRef
Gella, S.[Spandana],
Keller, F.[Frank],
Lapata, M.[Mirella],
Disambiguating Visual Verbs,
PAMI(41), No. 2, February 2019, pp. 311-322.
IEEE DOI
1901
Given an image and a verb, assign the correct sense of the verb.
Visualization, Image recognition, Semantics,
Natural language processing, Horses, Bicycles
BibRef
Xu, N.[Ning],
Liu, A.A.[An-An],
Liu, J.[Jing],
Nie, W.Z.[Wei-Zhi],
Su, Y.T.[Yu-Ting],
Scene graph captioner:
Image captioning based on structural visual representation,
JVCIR(58), 2019, pp. 477-485.
Elsevier DOI
1901
Image captioning, Scene graph, Structural representation, Attention
BibRef
Jiu, M.Y.[Ming-Yuan],
Sahbi, H.[Hichem],
Deep representation design from deep kernel networks,
PR(88), 2019, pp. 447-457.
Elsevier DOI
1901
Multiple kernel learning, Kernel design, Deep networks,
Efficient computation, Image annotation
BibRef
He, X.W.[Xin-Wei],
Shi, B.G.[Bao-Guang],
Bai, X.[Xiang],
Xia, G.S.[Gui-Song],
Zhang, Z.X.[Zhao-Xiang],
Dong, W.S.[Wei-Sheng],
Image Caption Generation with Part of Speech Guidance,
PRL(119), 2019, pp. 229-237.
Elsevier DOI
1902
Image caption generation, Part-of-speech tags,
Long Short-Term Memory, Visual attributes
BibRef
Xiao, X.Y.[Xin-Yu],
Wang, L.F.[Ling-Feng],
Ding, K.[Kun],
Xiang, S.M.[Shi-Ming],
Pan, C.[Chunhong],
Dense semantic embedding network for image captioning,
PR(90), 2019, pp. 285-296.
Elsevier DOI
1903
Image captioning, Retrieval, High-level semantic information,
Visual concept, Densely embedding, Long short-term memory
BibRef
Foumani, S.N.M.[Seyed Navid Mohammadi],
Nickabadi, A.[Ahmad],
A probabilistic topic model using deep visual word representation for
simultaneous image classification and annotation,
JVCIR(59), 2019, pp. 195-203.
Elsevier DOI
1903
Image classification and annotation, Topic models,
Probabilistic model, Deep learning,
LLC
BibRef
Zhang, X.R.[Xiang-Rong],
Wang, X.[Xin],
Tang, X.[Xu],
Zhou, H.Y.[Hui-Yu],
Li, C.[Chen],
Description Generation for Remote Sensing Images Using Attribute
Attention Mechanism,
RS(11), No. 6, 2019, pp. xx-yy.
DOI Link
1903
BibRef
Zheng, H.[He],
Wu, J.H.[Jia-Hong],
Liang, R.[Rui],
Li, Y.[Ye],
Li, X.Z.[Xu-Zhi],
Multi-task learning for captioning images with novel words,
IET-CV(13), No. 3, April 2019, pp. 294-301.
DOI Link
1904
BibRef
Ding, S.T.[Song-Tao],
Qu, S.[Shiru],
Xi, Y.L.[Yu-Ling],
Sangaiah, A.K.[Arun Kumar],
Wan, S.H.[Shao-Hua],
Image caption generation with high-level image features,
PRL(123), 2019, pp. 89-95.
Elsevier DOI
1906
Image captioning, Language model,
Bottom-up attention mechanism, Faster R-CNN
BibRef
Liu, X.X.[Xiao-Xiao],
Xu, Q.Y.[Qing-Yang],
Wang, N.[Ning],
A survey on deep neural network-based image captioning,
VC(35), No. 3, March 2019, pp. 445-470.
WWW Link.
1906
BibRef
Hossain, M.Z.[Md. Zakir],
Sohel, F.[Ferdous],
Shiratuddin, M.F.[Mohd Fairuz],
Laga, H.[Hamid],
A Comprehensive Survey of Deep Learning for Image Captioning,
Surveys(51), No. 6, February 2019, pp. Article No 118.
DOI Link
1906
Survey, Captioning.
BibRef
Peng, Y.Q.[Yu-Qing],
Liu, X.[Xuan],
Wang, W.H.[Wei-Hua],
Zhao, X.S.[Xiao-Song],
Wei, M.[Ming],
Image caption model of double LSTM with scene factors,
IVC(86), 2019, pp. 38-44.
Elsevier DOI
1906
Image caption, Deep neural network, Scene recognition, Semantic information
BibRef
Zhang, J.J.[Jun-Jie],
Wu, Q.[Qi],
Zhang, J.[Jian],
Shen, C.H.[Chun-Hua],
Lu, J.F.[Jian-Feng],
Wu, Q.A.[Qi-Ang],
Heritage image annotation via collective knowledge,
PR(93), 2019, pp. 204-214.
Elsevier DOI
1906
Annotation diversity, Image annotation,
Representation learning, Collective knowledge, Heritage image collection
BibRef
Verma, Y.[Yashaswi],
Diverse image annotation with missing labels,
PR(93), 2019, pp. 470-484.
Elsevier DOI
1906
Image annotation, Diverse labels, Missing labels, Nearest neighbour
BibRef
Markatopoulou, F.,
Mezaris, V.,
Patras, I.,
Implicit and Explicit Concept Relations in Deep Neural Networks for
Multi-Label Video/Image Annotation,
CirSysVideo(29), No. 6, June 2019, pp. 1631-1644.
IEEE DOI
1906
Task analysis, Correlation, Standards, Training,
Neural networks, Semantics, Video/image concept annotation,
video analysis
BibRef
Zhang, Z.J.[Zong-Jian],
Wu, Q.[Qiang],
Wang, Y.[Yang],
Chen, F.[Fang],
High-Quality Image Captioning With Fine-Grained and Semantic-Guided
Visual Attention,
MultMed(21), No. 7, July 2019, pp. 1681-1693.
IEEE DOI
1906
BibRef
Earlier:
Fine-Grained and Semantic-Guided Visual Attention for Image
Captioning,
WACV18(1709-1717)
IEEE DOI
1806
Visualization, Semantics, Feature extraction, Decoding,
Task analysis, Object oriented modeling, Image resolution,
fully convolutional network-long short term memory framework.
feedforward neural nets, image representation,
image segmentation, convolutional neural network
BibRef
Laib, L.[Lakhdar],
Allili, M.S.[Mohand Saïd],
Ait-Aoudia, S.[Samy],
A probabilistic topic model for event-based image classification and
multi-label annotation,
SP:IC(76), 2019, pp. 283-294.
Elsevier DOI
1906
Event recognition, Image annotation, Topic modeling, Convolutional neural nets
BibRef
Olaode, A.[Abass],
Naghdy, G.[Golshah],
Review of the application of machine learning to the automatic semantic
annotation of images,
IET-IPR(13), No. 8, 20 June 2019, pp. 1232-1245.
DOI Link
1906
BibRef
Li, X.,
Jiang, S.,
Know More Say Less: Image Captioning Based on Scene Graphs,
MultMed(21), No. 8, August 2019, pp. 2117-2130.
IEEE DOI
1908
convolutional neural nets, feature extraction, graph theory,
image representation, learning (artificial intelligence),
vision-language
BibRef
Zhang, C.J.[Chun-Jie],
Cheng, J.[Jian],
Tian, Q.[Qi],
Multiview, Few-Labeled Object Categorization by Predicting Labels
With View Consistency,
Cyber(49), No. 11, November 2019, pp. 3834-3843.
IEEE DOI
1908
image annotation, image classification,
learning (artificial intelligence), mapping function,
view consistency
BibRef
Sharif, N.[Naeha],
White, L.[Lyndon],
Bennamoun, M.[Mohammed],
Liu, W.[Wei],
Shah, S.A.A.[Syed Afaq Ali],
LCEval: Learned Composite Metric for Caption Evaluation,
IJCV(127), No. 10, October 2019, pp. 1586-1610.
Springer DOI
1909
Fine-grained analysis.
BibRef
Zhang, Z.Y.[Zheng-Yuan],
Diao, W.H.[Wen-Hui],
Zhang, W.K.[Wen-Kai],
Yan, M.L.[Meng-Long],
Gao, X.[Xin],
Sun, X.[Xian],
LAM: Remote Sensing Image Captioning with Label-Attention Mechanism,
RS(11), No. 20, 2019, pp. xx-yy.
DOI Link
1910
BibRef
Fu, K.[Kun],
Li, Y.[Yang],
Zhang, W.K.[Wen-Kai],
Yu, H.F.[Hong-Feng],
Sun, X.[Xian],
Boosting Memory with a Persistent Memory Mechanism for Remote Sensing
Image Captioning,
RS(12), No. 11, 2020, pp. xx-yy.
DOI Link
2006
BibRef
Tan, J.H.,
Chan, C.S.,
Chuah, J.H.,
COMIC: Toward A Compact Image Captioning Model With Attention,
MultMed(21), No. 10, October 2019, pp. 2686-2696.
IEEE DOI
1910
embedded systems, feature extraction, image retrieval, matrix algebra
BibRef
Zhou, L.,
Zhang, Y.,
Jiang, Y.,
Zhang, T.,
Fan, W.,
Re-Caption: Saliency-Enhanced Image Captioning Through Two-Phase
Learning,
IP(29), No. 1, 2020, pp. 694-709.
IEEE DOI
1910
feature extraction, image processing,
learning (artificial intelligence),
visual attribute
BibRef
Yang, L.[Liang],
Hu, H.F.[Hai-Feng],
Visual Skeleton and Reparative Attention for Part-of-Speech image
captioning system,
CVIU(189), 2019, pp. 102819.
Elsevier DOI
1911
Neural network, Visual attention, Image captioning
BibRef
Wang, J.B.[Jun-Bo],
Wang, W.[Wei],
Wang, L.[Liang],
Wang, Z.Y.[Zhi-Yong],
Feng, D.D.[David Dagan],
Tan, T.N.[Tie-Niu],
Learning visual relationship and context-aware attention for image
captioning,
PR(98), 2020, pp. 107075.
Elsevier DOI
1911
Image captioning, Relational reasoning, Context-aware attention
BibRef
Xiao, X.,
Wang, L.,
Ding, K.,
Xiang, S.,
Pan, C.,
Deep Hierarchical Encoder-Decoder Network for Image Captioning,
MultMed(21), No. 11, November 2019, pp. 2942-2956.
IEEE DOI
1911
Visualization, Semantics, Hidden Markov models, Decoding,
Logic gates, Training, Computer architecture,
vision-sentence
BibRef
Jiang, T.[Teng],
Zhang, Z.[Zehan],
Yang, Y.[Yupu],
Modeling coverage with semantic embedding for image caption generation,
VC(35), No. 11, November 2019, pp. 1655-1665.
WWW Link.
1911
BibRef
Tang, C.,
Liu, X.,
Wang, P.,
Zhang, C.,
Li, M.,
Wang, L.,
Adaptive Hypergraph Embedded Semi-Supervised Multi-Label Image
Annotation,
MultMed(21), No. 11, November 2019, pp. 2837-2849.
IEEE DOI
1911
Image annotation, Semisupervised learning, Semantics,
Computational modeling, Task analysis, Training, Computer science,
feature projection
BibRef
Mundnich, K.[Karel],
Booth, B.M.[Brandon M.],
Girault, B.[Benjamin],
Narayanan, S.[Shrikanth],
Generating labels for regression of subjective constructs using
triplet embeddings,
PRL(128), 2019, pp. 385-392.
Elsevier DOI
1912
Continuous-time annotations, Annotation fusion,
Inter-rater agreement, Triplet embeddings, Ordinal embeddings
BibRef
Lu, X.,
Wang, B.,
Zheng, X.,
Sound Active Attention Framework for Remote Sensing Image Captioning,
GeoRS(58), No. 3, March 2020, pp. 1985-2000.
IEEE DOI
2003
Active attention, remote sensing image captioning, semantic understanding
BibRef
Wu, L.,
Xu, M.,
Wang, J.,
Perry, S.,
Recall What You See Continually Using GridLSTM in Image Captioning,
MultMed(22), No. 3, March 2020, pp. 808-818.
IEEE DOI
2003
Visualization, Decoding, Task analysis, Neural networks, Training,
Computational modeling, Logic gates, Image captioning,
GridLSTM, recurrent neural network
BibRef
Li, Y.Y.[Yang-Yang],
Fang, S.K.[Shuang-Kang],
Jiao, L.C.[Li-Cheng],
Liu, R.J.[Rui-Jiao],
Shang, R.H.[Rong-Hua],
A Multi-Level Attention Model for Remote Sensing Image Captions,
RS(12), No. 6, 2020, pp. xx-yy.
DOI Link
2003
What are the important things in the image.
BibRef
Chaudhary, C.,
Goyal, P.,
Prasad, D.N.,
Chen, Y.P.,
Enhancing the Quality of Image Tagging Using a Visio-Textual
Knowledge Base,
MultMed(22), No. 4, April 2020, pp. 897-911.
IEEE DOI
2004
Knowledge based systems, Visualization, Image annotation,
Encyclopedias, Electronic publishing, Internet, Tagging,
knowledge based systems
BibRef
Chen, X.H.[Xing-Han],
Zhang, M.X.[Ming-Xing],
Wang, Z.[Zheng],
Zuo, L.[Lin],
Li, B.[Bo],
Yang, Y.[Yang],
Leveraging unpaired out-of-domain data for image captioning,
PRL(132), 2020, pp. 132-140.
Elsevier DOI
2005
Image captioning, Out-of-domain data, Deep learning
BibRef
Xu, N.,
Zhang, H.,
Liu, A.,
Nie, W.,
Su, Y.,
Nie, J.,
Zhang, Y.,
Multi-Level Policy and Reward-Based Deep Reinforcement Learning
Framework for Image Captioning,
MultMed(22), No. 5, May 2020, pp. 1372-1383.
IEEE DOI
2005
Visualization, Measurement, Task analysis, Reinforcement learning,
Optimization, Adaptation models, Semantics, Multi-level policy,
image captioning
BibRef
Deng, Z.R.[Zhen-Rong],
Jiang, Z.Q.[Zhou-Qin],
Lan, R.[Rushi],
Huang, W.M.[Wen-Ming],
Luo, X.N.[Xiao-Nan],
Image captioning using DenseNet network and adaptive attention,
SP:IC(85), 2020, pp. 115836.
Elsevier DOI
2005
Image captioning, DenseNet, LSTM, Adaptive attention mechanism
BibRef
Ji, J.,
Xu, C.,
Zhang, X.,
Wang, B.,
Song, X.,
Spatio-Temporal Memory Attention for Image Captioning,
IP(29), 2020, pp. 7615-7628.
IEEE DOI
2007
Image captioning, spatio-temporal relationship,
attention transmission, memory attention, LSTM
BibRef
Guo, L.,
Liu, J.,
Lu, S.,
Lu, H.,
Show, Tell, and Polish: Ruminant Decoding for Image Captioning,
MultMed(22), No. 8, August 2020, pp. 2149-2162.
IEEE DOI
2007
Decoding, Visualization, Planning, Training, Semantics,
Reinforcement learning, Task analysis, Image captioning,
rumination
BibRef
Khatchatoorian, A.G.[Artin Ghostan],
Jamzad, M.[Mansour],
Architecture to improve the accuracy of automatic image annotation
systems,
IET-CV(14), No. 5, August 2020, pp. 214-223.
DOI Link
2007
BibRef
Theodosiou, Z.[Zenonas],
Tsapatsoulis, N.[Nicolas],
Image annotation: the effects of content, lexicon and annotation method,
MultInfoRetr(9), No. 3, September 2020, pp. 191-203.
WWW Link.
2008
BibRef
Che, W.B.[Wen-Bin],
Fan, X.P.[Xiao-Peng],
Xiong, R.Q.[Rui-Qin],
Zhao, D.B.[De-Bin],
Visual Relationship Embedding Network for Image Paragraph Generation,
MultMed(22), No. 9, September 2020, pp. 2307-2320.
IEEE DOI
2008
Visualization, Semantics, Task analysis, Proposals, Automobiles,
Buildings, Paragraph generation, image caption, LSTM
BibRef
Feng, Q.,
Wu, Y.,
Fan, H.,
Yan, C.,
Xu, M.,
Yang, Y.,
Cascaded Revision Network for Novel Object Captioning,
CirSysVideo(30), No. 10, October 2020, pp. 3413-3421.
IEEE DOI
2010
Visualization, Semantics, Task analysis, Detectors, Training,
Knowledge engineering, Feature extraction, Captioning,
semantic matching
BibRef
Wei, H.Y.[Hai-Yang],
Li, Z.X.[Zhi-Xin],
Zhang, C.L.[Can-Long],
Ma, H.F.[Hui-Fang],
The synergy of double attention: Combine sentence-level and
word-level attention for image captioning,
CVIU(201), 2020, pp. 103068.
Elsevier DOI
2011
Image captioning, Sentence-level attention,
Word-level attention, Reinforcement learning
BibRef
Shilpa, M.[Mohankumar],
He, J.[Jun],
Zhao, Y.[Yijia],
Sun, B.[Bo],
Yu, L.J.[Le-Jun],
Feedback evaluations to promote image captioning,
IET-IPR(14), No. 13, November 2020, pp. 3021-3027.
DOI Link
2012
BibRef
Zhang, J.,
Mei, K.,
Zheng, Y.,
Fan, J.,
Integrating Part of Speech Guidance for Image Captioning,
MultMed(23), 2021, pp. 92-104.
IEEE DOI
2012
Visualization, Predictive models, Semantics, Feature extraction,
Task analysis, Speech processing, Part of speech,
multi-task learning
BibRef
Sharif, N.[Naeha],
Jalwana, M.A.A.K.[Mohammad A.A.K.],
Bennamoun, M.[Mohammed],
Liu, W.[Wei],
Shah, S.A.A.[Syed Afaq Ali],
Leveraging Linguistically-aware Object Relations and NASNet for Image
Captioning,
IVCNZ20(1-6)
IEEE DOI
2012
Visualization, Semantics, Pipelines, Computer architecture,
Knowledge discovery, Feature extraction, Task analysis,
NASNet
BibRef
Gouthaman, K.V.,
Nambiar, A.[Athira],
Srinivas, K.S.[Kancheti Sai],
Mittal, A.[Anurag],
Linguistically-aware attention for reducing the semantic gap in
vision-language tasks,
PR(112), 2021, pp. 107812.
Elsevier DOI
2102
Attention models, Visual question answering,
Counting in visual question answering, Image captioning
BibRef
Liu, H.,
Zhang, S.,
Lin, K.,
Wen, J.,
Li, J.,
Hu, X.,
Vocabulary-Wide Credit Assignment for Training Image Captioning
Models,
IP(30), 2021, pp. 2450-2460.
IEEE DOI
2102
Training, Measurement, Task analysis, Vocabulary, Feature extraction,
Maximum likelihood estimation, Adaptation models
BibRef
Xu, N.[Ning],
Tian, H.S.[Hong-Shuo],
Wang, Y.H.[Yan-Hui],
Nie, W.Z.[Wei-Zhi],
Song, D.[Dan],
Liu, A.A.[An-An],
Liu, W.[Wu],
Coupled-dynamic learning for vision and language:
Exploring Interaction between different tasks,
PR(113), 2021, pp. 107829.
Elsevier DOI
2103
Image captioning, Image synthesis, Coupled dynamics
BibRef
Zhang, J.[Jing],
Li, K.K.[Kang-Kang],
Wang, Z.[Zhe],
Parallel-fusion LSTM with synchronous semantic and visual information
for image captioning,
JVCIR(75), 2021, pp. 103044.
Elsevier DOI
2103
Image captioning, Parallel-fusion LSTM, Attention mechanism, Guiding LSTM
BibRef
Yang, L.,
Wang, H.,
Tang, P.,
Li, Q.,
CaptionNet: A Tailor-made Recurrent Neural Network for Generating
Image Descriptions,
MultMed(23), 2021, pp. 835-845.
IEEE DOI
2103
Visualization, Feature extraction, Semantics, Task analysis,
Predictive models, Computational modeling,
reinforcement learning
BibRef
Liu, A.A.[An-An],
Wang, Y.H.[Yan-Hui],
Xu, N.[Ning],
Liu, S.[Shan],
Li, X.[Xuanya],
Scene-Graph-Guided message passing network for dense captioning,
PRL(145), 2021, pp. 187-193.
Elsevier DOI
2104
Scene graph, Dense captioning, Message passing
BibRef
Zhang, L.[Le],
Zhang, Y.S.[Yan-Shuo],
Zhao, X.[Xin],
Zou, Z.X.[Ze-Xiao],
Image captioning via proximal policy optimization,
IVC(108), 2021, pp. 104126.
Elsevier DOI
2104
Image captioning, Reinforcement learning, Proximal policy optimization
BibRef
Ji, J.Z.[Jun-Zhong],
Du, Z.R.[Zhuo-Ran],
Zhang, X.D.[Xiao-Dan],
Divergent-convergent attention for image captioning,
PR(115), 2021, pp. 107928.
Elsevier DOI
2104
Image Captioning, Divergent Observation, Convergent Attention
BibRef
Wei, Y.W.[Yi-Wei],
Wu, C.L.[Chun-Lei],
Jia, Z.Y.[Zhi-Yang],
Hu, X.[XuFei],
Guo, S.[Shuang],
Shi, H.T.[Hai-Tao],
Past is important: Improved image captioning by looking back in time,
SP:IC(94), 2021, pp. 116183.
Elsevier DOI
2104
Image captioning, Reinforcement learning, Visual attention
BibRef
Zhang, Z.J.[Zong-Jian],
Wu, Q.[Qiang],
Wang, Y.[Yang],
Chen, F.[Fang],
Exploring region relationships implicitly:
Image captioning with visual relationship attention,
IVC(109), 2021, pp. 104146.
Elsevier DOI
2105
Image captioning, Visual relationship attention,
Relationship-level attention parallel attention mechanism,
Learned spatial constraint
BibRef
Zhang, Z.J.[Zong-Jian],
Wu, Q.[Qiang],
Wang, Y.[Yang],
Chen, F.[Fang],
Exploring Pairwise Relationships Adaptively From Linguistic Context
in Image Captioning,
MultMed(24), 2022, pp. 3101-3113.
IEEE DOI
2206
Visualization, Linguistics, Decoding, Modulation, Context modeling,
Adaptation models, Semantics, Bilinear attention,
visual relationship attention
BibRef
Li, X.L.[Xue-Long],
Zhang, X.T.[Xue-Ting],
Huang, W.[Wei],
Wang, Q.[Qi],
Truncation Cross Entropy Loss for Remote Sensing Image Captioning,
GeoRS(59), No. 6, June 2021, pp. 5246-5257.
IEEE DOI
2106
Feature extraction, Remote sensing, Entropy, Semantics, Decoding,
Optimization, Visualization, Image captioning, overfitting,
truncation cross entropy (TCE) loss
BibRef
He, S.[Shan],
Lu, Y.Y.[Yuan-Yao],
Chen, S.N.[Sheng-Nan],
Image Captioning Algorithm Based on Multi-Branch CNN and Bi-LSTM,
IEICE(E104-D), No. 7, July 2021, pp. 941-947.
WWW Link.
2107
BibRef
Zhong, X.[Xian],
Nie, G.Z.[Guo-Zhang],
Huang, W.X.[Wen-Xin],
Liu, W.X.[Wen-Xuan],
Ma, B.[Bo],
Lin, C.W.[Chia-Wen],
Attention-guided image captioning with adaptive global and local
feature fusion,
JVCIR(78), 2021, pp. 103138.
Elsevier DOI
2107
Image captioning, Encoder-decoder, Spatial information, Adaptive attention
BibRef
Sumbul, G.[Gencer],
Nayak, S.[Sonali],
Demir, B.[Begüm],
SD-RSIC: Summarization-Driven Deep Remote Sensing Image Captioning,
GeoRS(59), No. 8, August 2021, pp. 6922-6934.
IEEE DOI
2108
Training, Standards, Semantics, Feature extraction, Remote sensing,
Neural networks, Task analysis, Caption summarization,
remote sensing (RS)
BibRef
Wu, J.[Jie],
Chen, T.S.[Tian-Shui],
Wu, H.F.[He-Feng],
Yang, Z.[Zhi],
Luo, G.C.[Guang-Chun],
Lin, L.[Liang],
Fine-Grained Image Captioning With Global-Local Discriminative
Objective,
MultMed(23), 2021, pp. 2413-2427.
IEEE DOI
2108
Training, Visualization, Task analysis, Semantics,
Reinforcement learning, Pipelines, Maximum likelihood estimation,
Self-retrieval
BibRef
Wu, L.X.[Ling-Xiang],
Xu, M.[Min],
Sang, L.[Lei],
Yao, T.[Ting],
Mei, T.[Tao],
Noise Augmented Double-Stream Graph Convolutional Networks for Image
Captioning,
CirSysVideo(31), No. 8, August 2021, pp. 3118-3127.
IEEE DOI
2108
Visualization, Training, Generators, Reinforcement learning,
Decoding, Streaming media, Recurrent neural networks, Captioning,
adaptive noise
BibRef
Nivedita, M.,
Chandrashekar, P.[Priyanka],
Mahapatra, S.[Shibani],
Phamila, Y.A.V.[Y. Asnath Victy],
Selvaperumal, S.K.[Sathish Kumar],
Image Captioning for Video Surveillance System using Neural Networks,
IJIG(21), No. 4, October 2021, pp. 2150044.
DOI Link
2110
BibRef
Haghighi, F.[Fatemeh],
Taher, M.R.H.[Mohammad Reza Hosseinzadeh],
Zhou, Z.W.[Zong-Wei],
Gotway, M.B.[Michael B.],
Liang, J.M.[Jian-Ming],
Transferable Visual Words: Exploiting the Semantics of Anatomical
Patterns for Self-Supervised Learning,
MedImg(40), No. 10, October 2021, pp. 2857-2868.
IEEE DOI
2110
WWW Link.
Code, Visual Words. Medical image annotation.
Visualization, Semantics, Image representation, Feature extraction,
Biomedical imaging, Annotations, Training,
and 3D pre-trained models
BibRef
Wang, Q.[Qi],
Huang, W.[Wei],
Zhang, X.[Xueting],
Li, X.L.[Xue-Long],
Word-Sentence Framework for Remote Sensing Image Captioning,
GeoRS(59), No. 12, December 2021, pp. 10532-10543.
IEEE DOI
2112
Remote sensing, Feature extraction, Generators, Decoding,
Task analysis, Visualization, Semantics, Deep learning,
word-sentence framework
BibRef
Wan, B.Y.[Bo-Yang],
Jiang, W.H.[Wen-Hui],
Fang, Y.M.[Yu-Ming],
Zhu, M.W.[Min-Wei],
Li, Q.[Qin],
Liu, Y.[Yang],
Revisiting image captioning via maximum discrepancy competition,
PR(122), 2022, pp. 108358.
Elsevier DOI
2112
Image captioning, Model comparison, Attention mechanism
BibRef
Chen, T.Y.[Tian-Yu],
Li, Z.X.[Zhi-Xin],
Wu, J.L.[Jing-Li],
Ma, H.F.[Hui-Fang],
Su, B.P.[Bian-Ping],
Improving image captioning with Pyramid Attention and SC-GAN,
IVC(117), 2022, pp. 104340.
Elsevier DOI
2112
Image captioning, Pyramid Attention network,
Self-critical training, Reinforcement learning, Sequence-level learning
BibRef
Zhou, Y.J.[Yu-Jie],
Long, J.F.[Jie-Feng],
Xu, S.P.[Su-Ping],
Shang, L.[Lin],
Attribute-driven image captioning via soft-switch pointer,
PRL(152), 2021, pp. 34-41.
Elsevier DOI
2112
Image captioning, Visual attributes detection, Attention, Pointing mechanism
BibRef
Zha, Z.J.[Zheng-Jun],
Liu, D.[Daqing],
Zhang, H.[Hanwang],
Zhang, Y.D.[Yong-Dong],
Wu, F.[Feng],
Context-Aware Visual Policy Network for Fine-Grained Image Captioning,
PAMI(44), No. 2, February 2022, pp. 710-722.
IEEE DOI
2201
Visualization, Task analysis, Cognition, Decision making, Training,
Natural languages, Reinforcement learning, Image captioning,
policy network
BibRef
Wang, Q.Z.[Qing-Zhong],
Wan, J.[Jia],
Chan, A.B.[Antoni B.],
On Diversity in Image Captioning: Metrics and Methods,
PAMI(44), No. 2, February 2022, pp. 1035-1049.
IEEE DOI
2201
Measurement, Semantics, Learning (artificial intelligence),
Vegetation, Legged locomotion, Training, Computational modeling,
diversity metric
BibRef
Wang, J.[Jiuniu],
Xu, W.J.[Wen-Jia],
Wang, Q.Z.[Qing-Zhong],
Chan, A.B.[Antoni B.],
Compare and Reweight:
Distinctive Image Captioning Using Similar Images Sets,
ECCV20(I:370-386).
Springer DOI
2011
BibRef
Luo, G.F.[Gai-Fang],
Cheng, L.J.[Li-Jun],
Jing, C.[Chao],
Zhao, C.[Can],
Song, G.Z.[Guo-Zhu],
A thorough review of models, evaluation metrics, and datasets on
image captioning,
IET-IPR(16), No. 2, 2022, pp. 311-332.
DOI Link
2201
BibRef
Ben, H.X.[Hui-Xia],
Pan, Y.[Yingwei],
Li, Y.[Yehao],
Yao, T.[Ting],
Hong, R.[Richang],
Wang, M.[Meng],
Mei, T.[Tao],
Unpaired Image Captioning With Semantic-Constrained Self-Learning,
MultMed(24), 2022, pp. 904-916.
IEEE DOI
2202
Semantics, Image recognition, Training, Visualization, Decoding,
Task analysis, Dogs, Encoder-decoder networks, image captioning, self-supervised learning
BibRef
Li, Y.[Yehao],
Yao, T.[Ting],
Pan, Y.[Yingwei],
Chao, H.Y.[Hong-Yang],
Mei, T.[Tao],
Pointing Novel Objects in Image Captioning,
CVPR19(12489-12498).
IEEE DOI
2002
BibRef
Liu, M.F.[Mao-Fu],
Hu, H.J.[Hui-Jun],
Li, L.J.[Ling-Jun],
Yu, Y.[Yan],
Guan, W.L.[Wei-Li],
Chinese Image Caption Generation via Visual Attention and Topic
Modeling,
Cyber(52), No. 2, February 2022, pp. 1247-1257.
IEEE DOI
2202
Visualization, Decoding, Semantics, Predictive models,
Feature extraction, Natural language processing,
visual attention
BibRef
Yang, Q.Q.[Qiao-Qiao],
Ni, Z.[Zihao],
Ren, P.[Peng],
Meta captioning:
A meta learning based remote sensing image captioning framework,
PandRS(186), 2022, pp. 190-200.
Elsevier DOI
2203
Remote sensing image captioning, Meta learning
BibRef
Yang, X.[Xu],
Zhang, H.[Hanwang],
Cai, J.F.[Jian-Fei],
Auto-Encoding and Distilling Scene Graphs for Image Captioning,
PAMI(44), No. 5, May 2022, pp. 2313-2327.
IEEE DOI
2204
Visualization, Decoding, Training, Roads, Pipelines, Dictionaries,
Semantics, Image captioning, scene graph, transfer learning,
knowledge distillation
BibRef
Yang, X.[Xu],
Tang, K.[Kaihua],
Zhang, H.[Hanwang],
Cai, J.F.[Jian-Fei],
Auto-Encoding Scene Graphs for Image Captioning,
CVPR19(10677-10686).
IEEE DOI
2002
BibRef
Yang, Z.P.[Zuo-Peng],
Wang, P.B.[Peng-Bo],
Chu, T.S.[Tian-Shu],
Yang, J.[Jie],
Human-Centric Image Captioning,
PR(126), 2022, pp. 108545.
Elsevier DOI
2204
Human-centric, Image captioning, Feature hierarchization
BibRef
Li, X.[Xuan],
Zhang, W.K.[Wen-Kai],
Sun, X.[Xian],
Gao, X.[Xin],
Without detection: Two-step clustering features with local-global
attention for image captioning,
IET-CV(16), No. 3, 2022, pp. 280-294.
DOI Link
2204
BibRef
Yu, L.T.[Li-Tao],
Zhang, J.[Jian],
Wu, Q.[Qiang],
Dual Attention on Pyramid Feature Maps for Image Captioning,
MultMed(24), 2022, pp. 1775-1786.
IEEE DOI
2204
Visualization, Decoding, Task analysis, Semantics,
Feature extraction, Context modeling, Image captioning,
pyramid attention
BibRef
Zhang, M.[Min],
Chen, J.X.[Jing-Xiang],
Li, P.F.[Peng-Fei],
Jiang, M.[Ming],
Zhou, Z.[Zhe],
Topic scene graphs for image captioning,
IET-CV(16), No. 4, 2022, pp. 364-375.
DOI Link
2205
natural language processing
BibRef
Yu, Q.[Qiang],
Zhang, C.X.[Chun-Xia],
Weng, L.[Lubin],
Xiang, S.M.[Shi-Ming],
Pan, C.H.[Chun-Hong],
Scene captioning with deep fusion of images and point clouds,
PRL(158), 2022, pp. 9-15.
Elsevier DOI
2205
Scene captioning, Point cloud, Deep fusion
BibRef
Chaudhari, C.P.[Chaitrali Prasanna],
Devane, S.[Satish],
Improved Framework using Rider Optimization Algorithm for Precise Image
Caption Generation,
IJIG(22), No. 2, April 2022, pp. 2250021.
DOI Link
2205
BibRef
Shao, X.J.[Xiang-Jun],
Xiang, Z.L.[Zheng-Long],
Li, Y.X.[Yuan-Xiang],
Zhang, M.J.[Ming-Jie],
Variational joint self-attention for image captioning,
IET-IPR(16), No. 8, 2022, pp. 2075-2086.
DOI Link
2205
BibRef
Li, Y.C.[Yao-Chen],
Wu, C.[Chuan],
Li, L.[Ling],
Liu, Y.H.[Yue-Hu],
Zhu, J.[Jihua],
Caption Generation From Road Images for Traffic Scene Modeling,
ITS(23), No. 7, July 2022, pp. 7805-7816.
IEEE DOI
2207
Semantics, Roads, Visualization, Feature extraction,
Image reconstruction, Vehicle dynamics, Geometric analysis,
visual relationship detection
BibRef
Wang, Y.H.[Yan-Hui],
Xu, N.[Ning],
Liu, A.A.[An-An],
Li, W.H.[Wen-Hui],
Zhang, Y.D.[Yong-Dong],
High-Order Interaction Learning for Image Captioning,
CirSysVideo(32), No. 7, July 2022, pp. 4417-4430.
IEEE DOI
2207
Visualization, Semantics, Feature extraction, Decoding,
Task analysis, Ions, Encoding, Image captioning,
encoder-decoder framework
BibRef
Guo, D.D.[Dan-Dan],
Lu, R.Y.[Rui-Ying],
Chen, B.[Bo],
Zeng, Z.Q.[Ze-Qun],
Zhou, M.Y.[Ming-Yuan],
Matching Visual Features to Hierarchical Semantic Topics for Image
Paragraph Captioning,
IJCV(130), No. 8, August 2022, pp. 1920-1937.
Springer DOI
2207
BibRef
Demirel, B.[Berkan],
Cinbis, R.G.[Ramazan Gokberk],
Caption generation on scenes with seen and unseen object categories,
IVC(124), 2022, pp. 104515.
Elsevier DOI
2208
Zero-shot learning, Zero-shot image captioning
BibRef
Liu, Z.Y.[Zong-Yin],
Dong, A.M.[An-Ming],
Yu, J.G.[Ji-Guo],
Han, Y.B.[Yu-Bing],
Zhou, Y.[You],
Zhao, K.[Kai],
Scene classification for remote sensing images with self-attention
augmented CNN,
IET-IPR(16), No. 11, 2022, pp. 3085-3096.
DOI Link
2208
BibRef
Wu, X.X.[Xin-Xiao],
Zhao, W.T.[Wen-Tian],
Luo, J.B.[Jie-Bo],
Learning Cooperative Neural Modules for Stylized Image Captioning,
IJCV(130), No. 9, September 2022, pp. 2305-2320.
Springer DOI
2208
BibRef
Zhou, H.[Haonan],
Du, X.P.[Xiao-Ping],
Xia, L.[Lurui],
Li, S.[Sen],
Self-Learning for Few-Shot Remote Sensing Image Captioning,
RS(14), No. 18, 2022, pp. xx-yy.
DOI Link
2209
BibRef
Kim, D.J.[Dong-Jin],
Oh, T.H.[Tae-Hyun],
Choi, J.[Jinsoo],
Kweon, I.S.[In So],
Dense Relational Image Captioning via Multi-Task Triple-Stream
Networks,
PAMI(44), No. 11, November 2022, pp. 7348-7362.
IEEE DOI
2210
BibRef
Earlier: A1, A3, A2, A4:
Dense Relational Captioning: Triple-Stream Networks for
Relationship-Based Captioning,
CVPR19(6264-6273).
IEEE DOI
2002
Task analysis, Visualization, Proposals, Dogs, Motorcycles,
Natural languages, Genomics, Dense captioning, image captioning,
scene graph
BibRef
Cao, S.[Shan],
An, G.[Gaoyun],
Zheng, Z.X.[Zhen-Xing],
Wang, Z.Y.[Zhi-Yong],
Vision-Enhanced and Consensus-Aware Transformer for Image Captioning,
CirSysVideo(32), No. 10, October 2022, pp. 7005-7018.
IEEE DOI
2210
Transformers, Visualization, Decoding, Semantics, Task analysis,
Convolution, Visual perception, Image captioning,
consensus knowledge
BibRef
Nguyen, T.S.[Thanh-Son],
Fernando, B.[Basura],
Effective Multimodal Encoding for Image Paragraph Captioning,
IP(31), 2022, pp. 6381-6395.
IEEE DOI
2211
Image coding, Visualization, Encoding, Generators, Training,
Image reconstruction, Decoding, Multimodal encoding generation, autoencoder
BibRef
Jiang, W.T.[Wei-Tao],
Zhou, W.[Wei],
Hu, H.F.[Hai-Feng],
Double-Stream Position Learning Transformer Network for Image
Captioning,
CirSysVideo(32), No. 11, November 2022, pp. 7706-7718.
IEEE DOI
2211
Transformers, Feature extraction, Visualization, Decoding,
Convolutional neural networks, Task analysis, Semantics, attention mechanism
BibRef
Stefanini, M.[Matteo],
Cornia, M.[Marcella],
Baraldi, L.[Lorenzo],
Cascianelli, S.[Silvia],
Fiameni, G.[Giuseppe],
Cucchiara, R.[Rita],
From Show to Tell: A Survey on Deep Learning-Based Image Captioning,
PAMI(45), No. 1, January 2023, pp. 539-559.
IEEE DOI
2212
Survey, Image Captions.
Visualization, Feature extraction, Task analysis,
Convolutional neural networks, Additives, Image coding, Training
BibRef
Wu, Y.[Yu],
Jiang, L.[Lu],
Yang, Y.[Yi],
Switchable Novel Object Captioner,
PAMI(45), No. 1, January 2023, pp. 1162-1173.
IEEE DOI
2212
Training, Visualization, Switches, Task analysis, Training data,
Decoding, Convolutional neural networks, Image captioning, zero-shot learning
BibRef
Hu, J.T.[Jun-Tao],
Yang, Y.[You],
Yao, L.[Lu],
An, Y.Z.[Yong-Zhi],
Pan, L.[Longyue],
Position-guided transformer for image captioning,
IVC(128), 2022, pp. 104575.
Elsevier DOI
2212
Image captioning, Bi-positional attention, Position encoding,
Group normalization, Transformer, Self-attention
BibRef
Wang, Z.G.[Zhon-Gan],
Shi, S.[Shuai],
Zhai, Z.R.[Zi-Rong],
Wu, Y.[Yingna],
Yang, R.[Rui],
ArCo: Attention-reinforced transformer with contrastive learning for
image captioning,
IVC(128), 2022, pp. 104570.
Elsevier DOI
2212
Image captioning, Visual attention, Transformer, Contrastive learning
BibRef
Hochberg, D.C.[Dana Cohen],
Greenspan, H.[Hayit],
Giryes, R.[Raja],
A Self Supervised StyleGAN for Image Annotation and Classification
With Extremely Limited Labels,
MedImg(41), No. 12, December 2022, pp. 3509-3519.
IEEE DOI
2212
Task analysis, Training, Generators, Biomedical imaging,
Self-supervised learning, Annotations, Aerospace electronics,
representative selection
BibRef
Yang, X.[Xu],
Zhang, H.W.[Han-Wang],
Gao, C.Y.[Chong-Yang],
Cai, J.F.[Jian-Fei],
Learning to Collocate Visual-Linguistic Neural Modules for Image
Captioning,
IJCV(131), No. 1, January 2023, pp. 82-100.
Springer DOI
2301
BibRef
Earlier: A1, A2, A4, Only:
Learning to Collocate Neural Modules for Image Captioning,
ICCV19(4249-4259)
IEEE DOI
2004
image processing, learning (artificial intelligence),
natural language processing, neural nets, Neural networks
BibRef
Li, Z.X.[Zhi-Xin],
Wei, J.[Jiahui],
Huang, F.C.[Fei-Cheng],
Ma, H.F.[Hui-Fang],
Modeling graph-structured contexts for image captioning,
IVC(129), 2023, pp. 104591.
Elsevier DOI
2301
Image captioning, Transformer, Scene graph,
Reinforcement learning, Attention mechanism
BibRef
Wang, J.[Jiuniu],
Xu, W.J.[Wen-Jia],
Wang, Q.Z.[Qing-Zhong],
Chan, A.B.[Antoni B.],
On Distinctive Image Captioning via Comparing and Reweighting,
PAMI(45), No. 2, February 2023, pp. 2088-2103.
IEEE DOI
2301
Training, Measurement, Annotations, Semantics,
Maximum likelihood estimation, Xenon, Web and internet services,
metric
BibRef
Duan, Y.Q.[Yi-Qun],
Wang, Z.[Zhen],
Li, Y.[Yi],
Wang, J.Y.[Jing-Ya],
Cross-domain multi-style merge for image captioning,
CVIU(228), 2023, pp. 103617.
Elsevier DOI
2302
Vision and language, Image captioning, Controllable generation
BibRef
Wu, X.X.[Xin-Xiao],
Li, T.[Tong],
Sentimental Visual Captioning using Multimodal Transformer,
IJCV(131), No. 4, April 2023, pp. 1073-1090.
Springer DOI
2303
BibRef
Ma, Y.[Yiwei],
Ji, J.Y.[Jia-Yi],
Sun, X.S.[Xiao-Shuai],
Zhou, Y.[Yiyi],
Ji, R.R.[Rong-Rong],
Towards local visual modeling for image captioning,
PR(138), 2023, pp. 109420.
Elsevier DOI
2303
Image captioning, Attention mechanism, Local visual modeling
BibRef
Barati, A.[Alireza],
Farsi, H.[Hassan],
Mohamadzadeh, S.[Sajad],
Integration of the latent variable knowledge into deep image
captioning with Bayesian modeling,
IET-IPR(17), No. 7, 2023, pp. 2256-2271.
DOI Link
2305
attention mechanism, automatic image captioning,
deep neural networks, high-level semantic concepts, latent variable
BibRef
Feng, J.L.[Jun-Long],
Zhao, J.P.[Jian-Ping],
Effectively Utilizing the Category Labels for Image Captioning,
IEICE(E106-D), No. 5, May 2023, pp. 617-624.
WWW Link.
2305
BibRef
Zhang, Y.[Youyuan],
Wang, J.[Jiuniu],
Wu, H.[Hao],
Xu, W.J.[Wen-Jia],
Distinctive Image Captioning via CLIP Guided Group Optimization,
CMHRI22(223-238).
Springer DOI
2304
BibRef
Wang, T.J.J.[Tzu-Jui Julius],
Laaksonen, J.[Jorma],
Langer, T.[Tomas],
Arponen, H.[Heikki],
Bishop, T.E.[Tom E.],
Learning by Hallucinating:
Vision-Language Pre-training with Weak Supervision,
WACV23(1073-1083)
IEEE DOI
2302
Visualization, Vocabulary, Computational modeling, Detectors,
Benchmark testing, Transformers, un-supervised learning
BibRef
Qiu, Y.[Yue],
Yamamoto, S.[Shintaro],
Yamada, R.[Ryosuke],
Suzuki, R.[Ryota],
Kataoka, H.[Hirokatsu],
Iwata, K.[Kenji],
Satoh, Y.[Yutaka],
3D Change Localization and Captioning from Dynamic Scans of Indoor
Scenes,
WACV23(1176-1185)
IEEE DOI
2302
Location awareness, Point cloud compression, Image recognition,
Limiting, Detectors, Benchmark testing, 3D computer vision
BibRef
Honda, U.[Ukyo],
Watanabe, T.[Taro],
Matsumoto, Y.[Yuji],
Switching to Discriminative Image Captioning by Relieving a
Bottleneck of Reinforcement Learning,
WACV23(1124-1134)
IEEE DOI
2302
Vocabulary, Limiting, Computational modeling, Switches,
Reinforcement learning, Control systems,
visual reasoning
BibRef
Sui, J.H.[Jia-Hong],
Yu, H.M.[Hui-Min],
Liang, X.Y.[Xin-Yue],
Ping, P.[Ping],
Image Caption Method Based on Graph Attention Network with Global
Context,
ICIVC22(480-487)
IEEE DOI
2301
Deep learning, Visualization, Image coding, Semantics,
Neural networks, Image representation, Feature extraction, global feature
BibRef
Barraco, M.[Manuele],
Stefanini, M.[Matteo],
Cornia, M.[Marcella],
Cascianelli, S.[Silvia],
Baraldi, L.[Lorenzo],
Cucchiara, R.[Rita],
CaMEL: Mean Teacher Learning for Image Captioning,
ICPR22(4087-4094)
IEEE DOI
2212
Training, Measurement, Knowledge engineering, Visualization,
Source coding, Natural languages, Feature extraction
BibRef
Lou, L.S.[Liang-Shan],
Lu, K.[Ke],
Xue, J.[Jian],
Improved Transformer with Parallel Encoders for Image Captioning,
ICPR22(4072-4075)
IEEE DOI
2212
Measurement, Fuses, Transformers, Decoding, Task analysis
BibRef
Wang, Y.H.[Ye-Huan],
Shang, L.[Lin],
Generating Spatial-aware Captions for TextCaps,
ICPR22(379-385)
IEEE DOI
2212
Visualization, Analytical models, Head, Optical character recognition,
Transformer cores, Transformers
BibRef
Feng, Y.[Yuhu],
Maeda, K.[Keisuke],
Ogawa, T.[Takahiro],
Haseyama, M.[Miki],
Human-Centric Image Retrieval with Gaze-Based Image Captioning,
ICIP22(3828-3832)
IEEE DOI
2211
Image retrieval, Semantics, Focusing, Transformers, Gaze trace,
transformer, human-centric, cross-modal retrieval, image captioning
BibRef
Arguello, P.[Paula],
Lopez, J.[Jhon],
Hinojosa, C.[Carlos],
Arguello, H.[Henry],
Optics Lens Design for Privacy-Preserving Scene Captioning,
ICIP22(3551-3555)
IEEE DOI
2211
Integrated optics, Privacy, Optical design, Optical distortion,
Optical detectors, Optical imaging, Feature extraction,
Computational Optics
BibRef
Yang, X.[Xin],
Wang, Y.[Ying],
Chen, H.[Haishun],
Li, J.[Jie],
CSTNET: Enhancing Global-To-Local Interactions for Image Captioning,
ICIP22(1861-1865)
IEEE DOI
2211
Neural networks, Transformers, Task analysis, Context modeling,
Image captioning, Gate mechanism, Vision transformer, Deep Neural Network
BibRef
Hu, W.Z.[Wen-Zhe],
Wang, L.[Lanxiao],
Xu, L.[Linfeng],
Spatial-Semantic Attention for Grounded Image Captioning,
ICIP22(61-65)
IEEE DOI
2211
Measurement, Grounding, Semantics, Predictive models,
Feature extraction, Data mining, Proposals,
Multimodal
BibRef
Meng, Z.[Zihang],
Yang, D.[David],
Cao, X.F.[Xue-Fei],
Shah, A.[Ashish],
Lim, S.N.[Ser-Nam],
Object-Centric Unsupervised Image Captioning,
ECCV22(XXXVI:219-235).
Springer DOI
2211
BibRef
Nguyen, V.Q.[Van-Quang],
Suganuma, M.[Masanori],
Okatani, T.[Takayuki],
GRIT: Faster and Better Image Captioning Transformer Using Dual Visual
Features,
ECCV22(XXXVI:167-184).
Springer DOI
2211
BibRef
Wang, Z.[Zhen],
Chen, L.[Long],
Ma, W.[Wenbo],
Han, G.X.[Guang-Xing],
Niu, Y.[Yulei],
Shao, J.[Jian],
Xiao, J.[Jun],
Explicit Image Caption Editing,
ECCV22(XXXVI:113-129).
Springer DOI
2211
BibRef
Jiao, Y.[Yang],
Chen, S.X.[Shao-Xiang],
Jie, Z.[Zequn],
Chen, J.J.[Jing-Jing],
Ma, L.[Lin],
Jiang, Y.G.[Yu-Gang],
MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes,
ECCV22(XXXV:528-545).
Springer DOI
2211
BibRef
Nagrani, A.[Arsha],
Seo, P.H.[Paul Hongsuck],
Seybold, B.[Bryan],
Hauth, A.[Anja],
Manen, S.[Santiago],
Sun, C.[Chen],
Schmid, C.[Cordelia],
Learning Audio-Video Modalities from Image Captions,
ECCV22(XIV:407-426).
Springer DOI
2211
BibRef
Tewel, Y.[Yoad],
Shalev, Y.[Yoav],
Schwartz, I.[Idan],
Wolf, L.B.[Lior B.],
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic
Arithmetic,
CVPR22(17897-17907)
IEEE DOI
2210
Knowledge engineering, Training, Measurement, Visualization,
Text recognition, Semantics, Magnetic heads,
Vision+language, Transfer/low-shot/long-tail learning
BibRef
Truong, P.[Prune],
Danelljan, M.[Martin],
Yu, F.[Fisher],
Van Gool, L.J.[Luc J.],
Probabilistic Warp Consistency for Weakly-Supervised Semantic
Correspondences,
CVPR22(8698-8708)
IEEE DOI
2210
Image resolution, Costs, Semantics, Computer architecture,
Benchmark testing, Probabilistic logic, Motion and tracking, retrieval
BibRef
Chan, D.M.[David M.],
Myers, A.[Austin],
Vijayanarasimhan, S.[Sudheendra],
Ross, D.A.[David A.],
Seybold, B.[Bryan],
Canny, J.F.[John F.],
What's in a Caption? Dataset-Specific Linguistic Diversity and Its
Effect on Visual Description Models and Metrics,
VDU22(4739-4748)
IEEE DOI
2210
Measurement, Visualization, Analytical models, Video description,
Computational modeling, Training data, Linguistics
BibRef
Popattia, M.[Murad],
Rafi, M.[Muhammad],
Qureshi, R.[Rizwan],
Nawaz, S.[Shah],
Guiding Attention using Partial-Order Relationships for Image
Captioning,
MULA22(4670-4679)
IEEE DOI
2210
Training, Measurement, Visualization, Semantics, Computer architecture
BibRef
Barraco, M.[Manuele],
Cornia, M.[Marcella],
Cascianelli, S.[Silvia],
Baraldi, L.[Lorenzo],
Cucchiara, R.[Rita],
The Unreasonable Effectiveness of CLIP Features for Image Captioning:
An Experimental Analysis,
MULA22(4661-4669)
IEEE DOI
2210
Visualization, Protocols, Detectors,
Distance measurement, Data models
BibRef
Mohamed, Y.[Youssef],
Khan, F.F.[Faizan Farooq],
Haydarov, K.[Kilichbek],
Elhoseiny, M.[Mohamed],
It is Okay to Not Be Okay: Overcoming Emotional Bias in Affective
Image Captioning by Contrastive Data Collection,
CVPR22(21231-21240)
IEEE DOI
2210
Measurement, Codes, Human intelligence, Data collection, Data models,
Pattern recognition, Datasets and evaluation, Others, Vision + language
BibRef
Chen, J.[Jun],
Guo, H.[Han],
Yi, K.[Kai],
Li, B.Y.[Bo-Yang],
Elhoseiny, M.[Mohamed],
VisualGPT: Data-efficient Adaptation of Pretrained Language Models
for Image Captioning,
CVPR22(18009-18019)
IEEE DOI
2210
Training, Representation learning, Adaptation models,
Visualization, Computational modeling, Semantics, Linguistics,
Transfer/low-shot/long-tail learning
BibRef
Chen, S.[Simin],
Song, Z.[Zihe],
Haque, M.[Mirazul],
Liu, C.[Cong],
Yang, W.[Wei],
NICGSlowDown: Evaluating the Efficiency Robustness of Neural Image
Caption Generation Models,
CVPR22(15344-15353)
IEEE DOI
2210
Visualization, Computational modeling, Perturbation methods,
Robustness, Real-time systems, Pattern recognition,
Efficient learning and inferences
BibRef
Hirota, Y.[Yusuke],
Nakashima, Y.[Yuta],
Garcia, N.[Noa],
Quantifying Societal Bias Amplification in Image Captioning,
CVPR22(13440-13449)
IEEE DOI
2210
Measurement, Equalizers, Computational modeling, Focusing,
Predictive models, Skin, Transparency, fairness, accountability,
Vision + language
BibRef
Beddiar, D.[Djamila],
Oussalah, M.[Mourad],
Seppänen, T.[Tapio],
Explainability for Medical Image Captioning,
IPTA22(1-6)
IEEE DOI
2206
Visualization, Computational modeling, Semantics,
Feature extraction, Decoding, Convolutional neural networks,
Artificial Intelligence Explainability
BibRef
Bounab, Y.[Yazid],
Oussalah, M.[Mourad],
Ferdenache, A.[Ahlam],
Reconciling Image Captioning and User's Comments for Urban Tourism,
IPTA20(1-6)
IEEE DOI
2206
Visualization, Databases, Tourism industry, Pipelines, Tools, Internet,
Planning, Image captioning, social media, image description,
google vision API
BibRef
Zha, Z.W.[Zhi-Wei],
Zhou, P.F.[Peng-Fei],
Bai, C.[Cong],
Exploring Implicit and Explicit Relations with the Dual Relation-Aware
Network for Image Captioning,
MMMod22(II:97-108).
Springer DOI
2203
BibRef
Ruta, D.[Dan],
Motiian, S.[Saeid],
Faieta, B.[Baldo],
Lin, Z.[Zhe],
Jin, H.L.[Hai-Lin],
Filipkowski, A.[Alex],
Gilbert, A.[Andrew],
Collomosse, J.[John],
ALADIN: All Layer Adaptive Instance Normalization for Fine-grained
Style Similarity,
ICCV21(11906-11915)
IEEE DOI
2203
Training, Representation learning, Visualization,
Adaptation models, User-generated content, Neural generative models
BibRef
Nguyen, K.[Kien],
Tripathi, S.[Subarna],
Du, B.[Bang],
Guha, T.[Tanaya],
Nguyen, T.Q.[Truong Q.],
In Defense of Scene Graphs for Image Captioning,
ICCV21(1387-1396)
IEEE DOI
2203
Convolutional codes, Visualization, Image coding, Semantics,
Pipelines, Generators, Vision + language, Scene analysis and understanding
BibRef
Shi, J.[Jiahe],
Li, Y.[Yali],
Wang, S.J.[Sheng-Jin],
Partial Off-policy Learning: Balance Accuracy and Diversity for
Human-Oriented Image Captioning,
ICCV21(2167-2176)
IEEE DOI
2203
Correlation, Computational modeling, Reinforcement learning,
Generative adversarial networks, Task analysis,
BibRef
Alahmadi, R.[Rehab],
Hahn, J.[James],
Improve Image Captioning by Estimating the Gazing Patterns from the
Caption,
WACV22(2453-2462)
IEEE DOI
2202
Visualization, Computational modeling,
Neural networks, Feature extraction,
Vision and Languages Scene Understanding
BibRef
Biten, A.F.[Ali Furkan],
Gómez, L.[Lluís],
Karatzas, D.[Dimosthenis],
Let there be a clock on the beach:
Reducing Object Hallucination in Image Captioning,
WACV22(2473-2482)
IEEE DOI
2202
Measurement, Training, Visualization, Analytical models,
Computational modeling, Training data, Vision and Languages
BibRef
Deb, T.[Tonmoay],
Sadmanee, A.[Akib],
Bhaumik, K.K.[Kishor Kumar],
Ali, A.A.[Amin Ahsan],
Amin, M.A.[M Ashraful],
Rahman, A.K.M.M.[A.K.M. Mahbubur],
Variational Stacked Local Attention Networks for Diverse Video
Captioning,
WACV22(2493-2502)
IEEE DOI
2202
Measurement, Visualization, Stacking, Redundancy, Natural languages,
Streaming media, Syntactics, Vision and Languages Datasets,
Analysis and Understanding
BibRef
Lahtinen, T.[Tuomo],
Turtiainen, H.[Hannu],
Costin, A.[Andrei],
Brima: Low-Overhead Browser-Only Image Annotation Tool (Preprint),
ICIP21(2633-2637)
IEEE DOI
2201
Tool for interactive annotation.
BRowser-only IMage Annotation.
Training, Annotations, Image annotation, Documentation, Tools,
Browsers, Image Annotation, Annotation Tool, COCO
BibRef
Sharif, N.[Naeha],
White, L.[Lyndon],
Bennamoun, M.[Mohammed],
Liu, W.[Wei],
Shah, S.A.A.[Syed Afaq Ali],
WEmbSim: A Simple yet Effective Metric for Image Captioning,
DICTA20(1-8)
IEEE DOI
2201
Measurement, Correlation, Databases, Digital images,
Machine learning, SPICE, Task analysis, Image Captioning, Word Embeddings
BibRef
Lotfi, F.[Fariba],
Jamzad, M.[Mansour],
Beigy, H.[Hamid],
Automatic Image Annotation using Tag Relations and Graph
Convolutional Networks,
IPRIA21(1-6)
IEEE DOI
2201
Vocabulary, Visualization, Image analysis, Convolution, Annotations,
Image annotation, Games, Automatic image annotation, deep learning,
vocabulary graph
BibRef
Qiu, J.Y.[Jia-Yan],
Yang, Y.D.[Yi-Ding],
Wang, X.[Xinchao],
Tao, D.C.[Da-Cheng],
Scene Essence,
CVPR21(8318-8329)
IEEE DOI
2111
Image recognition, Graph neural networks,
Pattern recognition, Labeling, Lenses
BibRef
Hosseinzadeh, M.[Mehrdad],
Wang, Y.[Yang],
Image Change Captioning by Learning from an Auxiliary Task,
CVPR21(2724-2733)
IEEE DOI
2111
Training, Image color analysis, Image retrieval,
Semantics, Benchmark testing, Pattern recognition
BibRef
Chen, L.[Long],
Jiang, Z.H.[Zhi-Hong],
Xiao, J.[Jun],
Liu, W.[Wei],
Human-like Controllable Image Captioning with Verb-specific Semantic
Roles,
CVPR21(16841-16851)
IEEE DOI
2111
Visualization, Codes, Semantics, Benchmark testing,
Controllability, Pattern recognition
BibRef
Xu, G.H.[Guang-Hui],
Niu, S.C.[Shuai-Cheng],
Tan, M.K.[Ming-Kui],
Luo, Y.C.[Yu-Cheng],
Du, Q.[Qing],
Wu, Q.[Qi],
Towards Accurate Text-based Image Captioning with Content Diversity
Exploration,
CVPR21(12632-12641)
IEEE DOI
2111
Visualization, Image resolution, Benchmark testing,
Pattern recognition, Proposals, Optical character recognition software
BibRef
Chen, D.Z.Y.[Dave Zhen-Yu],
Gholami, A.[Ali],
Nießner, M.[Matthias],
Chang, A.X.[Angel X.],
Scan2Cap: Context-aware Dense Captioning in RGB-D Scans,
CVPR21(3192-3202)
IEEE DOI
2111
Location awareness, Message passing,
Natural languages, Pipelines, Computer architecture, Object detection
BibRef
Luong, Q.A.[Quoc-An],
Vo, D.M.[Duc Minh],
Sugimoto, A.[Akihiro],
Saliency based Subject Selection for Diverse Image Captioning,
MVA21(1-5)
DOI Link
2109
Measurement, Visualization, Diversity methods
BibRef
Sharif, N.[Naeha],
Bennamoun, M.[Mohammed],
Liu, W.[Wei],
Shah, S.A.A.[Syed Afaq Ali],
SubICap: Towards Subword-informed Image Captioning,
WACV21(3539-3540)
IEEE DOI
2106
Measurement, Training, Vocabulary, Image segmentation,
Image color analysis, Computational modeling, Semantics
BibRef
Chen, X.Y.[Xian-Yu],
Jiang, M.[Ming],
Zhao, Q.[Qi],
Self-Distillation for Few-Shot Image Captioning,
WACV21(545-555)
IEEE DOI
2106
Annotations, Training data, Manuals,
Data models, Task analysis
BibRef
Umemura, K.[Kazuki],
Kastner, M.A.[Marc A.],
Ide, I.[Ichiro],
Kawanishi, Y.[Yasutomo],
Hirayama, T.[Takatsugu],
Doman, K.[Keisuke],
Deguchi, D.[Daisuke],
Murase, H.[Hiroshi],
Tell as You Imagine: Sentence Imageability-aware Image Captioning,
MMMod21(II:62-73).
Springer DOI
2106
BibRef
Hallonquist, N.[Neil],
Geman, D.[Donald],
Younes, L.[Laurent],
Graph Discovery for Visual Test Generation,
ICPR21(7500-7507)
IEEE DOI
2105
Visualization, Vocabulary, Machine vision, Semantics,
Image representation, Knowledge discovery, Probability distribution
BibRef
Li, X.J.[Xin-Jie],
Yang, C.[Chun],
Chen, S.L.[Song-Lu],
Zhu, C.[Chao],
Yin, X.C.[Xu-Cheng],
Semantic Bilinear Pooling for Fine-Grained Recognition,
ICPR21(3660-3666)
IEEE DOI
2105
Training, Deep learning, Semantics, Birds, Pattern recognition,
Testing, Semantic Information, Bilinear Pooling, Fine-Grained Recognition
BibRef
Chavhan, R.[Ruchika],
Banerjee, B.[Biplab],
Zhu, X.X.[Xiao Xiang],
Chaudhuri, S.[Subhasis],
A Novel Actor Dual-Critic Model for Remote Sensing Image Captioning,
ICPR21(4918-4925)
IEEE DOI
2105
Training, Image coding, Reinforcement learning, Gain measurement,
Benchmark testing, Optical imaging, Data models
BibRef
Kalimuthu, M.[Marimuthu],
Mogadala, A.[Aditya],
Mosbach, M.[Marius],
Klakow, D.[Dietrich],
Fusion Models for Improved Image Captioning,
MMDLCA20(381-395).
Springer DOI
2103
BibRef
Cetinic, E.[Eva],
Iconographic Image Captioning for Artworks,
FAPER20(502-516).
Springer DOI
2103
BibRef
Huang, Y.Q.[Yi-Qing],
Chen, J.S.[Jian-Sheng],
Show, Conceive and Tell: Image Captioning with Prospective Linguistic
Information,
ACCV20(VI:478-494).
Springer DOI
2103
BibRef
Deng, C.R.[Chao-Rui],
Ding, N.[Ning],
Tan, M.K.[Ming-Kui],
Wu, Q.[Qi],
Length-controllable Image Captioning,
ECCV20(XIII:712-729).
Springer DOI
2011
BibRef
Gurari, D.[Danna],
Zhao, Y.N.[Yi-Nan],
Zhang, M.[Meng],
Bhattacharya, N.[Nilavra],
Captioning Images Taken by People Who Are Blind,
ECCV20(XVII:417-434).
Springer DOI
2011
BibRef
Jiu, M.,
Sahbi, H.,
End-to-End Deep Kernel Map Design for Image Annotation,
ICIP20(1546-1550)
IEEE DOI
2011
Kernel, Task analysis, Training, Image annotation, Neural networks,
Training data, Supervised learning, Deep kernel networks,
image annotation
BibRef
Zhong, Y.W.[Yi-Wu],
Wang, L.W.[Li-Wei],
Chen, J.S.[Jian-Shu],
Yu, D.[Dong],
Li, Y.[Yin],
Comprehensive Image Captioning via Scene Graph Decomposition,
ECCV20(XIV:211-229).
Springer DOI
2011
BibRef
Wang, Z.[Zeyu],
Feng, B.[Berthy],
Narasimhan, K.[Karthik],
Russakovsky, O.[Olga],
Towards Unique and Informative Captioning of Images,
ECCV20(VII:629-644).
Springer DOI
2011
BibRef
Sidorov, O.[Oleksii],
Hu, R.H.[Rong-Hang],
Rohrbach, M.[Marcus],
Singh, A.[Amanpreet],
TextCaps: A Dataset for Image Captioning with Reading Comprehension,
ECCV20(II:742-758).
Springer DOI
2011
BibRef
Durand, T.[Thibaut],
Learning User Representations for Open Vocabulary Image Hashtag
Prediction,
CVPR20(9766-9775)
IEEE DOI
2008
Tagging, Twitter, Computational modeling, Vocabulary,
Predictive models, History, Visualization
BibRef
Prabhudesai, M.[Mihir],
Tung, H.Y.F.[Hsiao-Yu Fish],
Javed, S.A.[Syed Ashar],
Sieb, M.[Maximilian],
Harley, A.W.[Adam W.],
Fragkiadaki, K.[Katerina],
Embodied Language Grounding With 3D Visual Feature Representations,
CVPR20(2217-2226)
IEEE DOI
2008
Associating language utterances to 3D visual abstractions.
Visualization,
Cameras, Feature extraction, Detectors, Solid modeling
BibRef
Li, Z.,
Tran, Q.,
Mai, L.,
Lin, Z.,
Yuille, A.L.,
Context-Aware Group Captioning via Self-Attention and Contrastive
Features,
CVPR20(3437-3447)
IEEE DOI
2008
Task analysis, Visualization, Context modeling,
Training, Natural languages, Computational modeling
BibRef
Zhou, Y.,
Wang, M.,
Liu, D.,
Hu, Z.,
Zhang, H.,
More Grounded Image Captioning by Distilling Image-Text Matching
Model,
CVPR20(4776-4785)
IEEE DOI
2008
Visualization, Grounding, Task analysis, Training, Measurement,
Computational modeling, Image edge detection
BibRef
Sammani, F.,
Melas-Kyriazi, L.,
Show, Edit and Tell: A Framework for Editing Image Captions,
CVPR20(4807-4815)
IEEE DOI
2008
Decoding, Visualization, Task analysis, Logic gates,
Natural languages, Adaptation models, Glass
BibRef
Chen, S.,
Jin, Q.,
Wang, P.,
Wu, Q.,
Say As You Wish: Fine-Grained Control of Image Caption Generation
With Abstract Scene Graphs,
CVPR20(9959-9968)
IEEE DOI
2008
Semantics, Decoding, Visualization, Feature extraction,
Controllability, Task analysis, Measurement
BibRef
Guo, L.,
Liu, J.,
Zhu, X.,
Yao, P.,
Lu, S.,
Lu, H.,
Normalized and Geometry-Aware Self-Attention Network for Image
Captioning,
CVPR20(10324-10333)
IEEE DOI
2008
Geometry, Task analysis, Visualization, Decoding, Training,
Feature extraction, Computer architecture
BibRef
Chen, J.,
Jin, Q.,
Better Captioning With Sequence-Level Exploration,
CVPR20(10887-10896)
IEEE DOI
2008
Task analysis, Measurement, Training, Computational modeling,
Computer architecture, Portable computers, Decoding
BibRef
Pan, Y.,
Yao, T.,
Li, Y.,
Mei, T.,
X-Linear Attention Networks for Image Captioning,
CVPR20(10968-10977)
IEEE DOI
2008
Visualization, Decoding, Cognition, Knowledge discovery,
Task analysis, Aggregates, Weight measurement
BibRef
Tran, A.,
Mathews, A.,
Xie, L.,
Transform and Tell: Entity-Aware News Image Captioning,
CVPR20(13032-13042)
IEEE DOI
2008
Decoding, Vocabulary, Transforms, Linguistics, Performance gain,
Neural networks, Training
BibRef
Park, G.[Geondo],
Han, C.[Chihye],
Kim, D.[Daeshik],
Yoon, W.J.[Won-Jun],
MHSAN: Multi-Head Self-Attention Network for Visual Semantic
Embedding,
WACV20(1507-1515)
IEEE DOI
2006
Feature extraction, Visualization, Semantics, Task analysis,
Recurrent neural networks, Image representation, Image coding
BibRef
Chen, C.,
Zhang, R.,
Koh, E.,
Kim, S.,
Cohen, S.,
Rossi, R.,
Figure Captioning with Relation Maps for Reasoning,
WACV20(1526-1534)
IEEE DOI
2006
Bars, Training, Visualization, Decoding, Computational modeling,
Task analysis, Portable document format
BibRef
He, S.,
Tavakoli, H.R.,
Borji, A.,
Pugeault, N.,
Human Attention in Image Captioning: Dataset and Analysis,
ICCV19(8528-8537)
IEEE DOI
2004
Code, Captioning.
WWW Link.
convolutional neural nets, image segmentation,
natural language processing, object detection, visual perception,
Adaptation models
BibRef
Huang, L.,
Wang, W.,
Chen, J.,
Wei, X.,
Attention on Attention for Image Captioning,
ICCV19(4633-4642)
IEEE DOI
2004
Code, Captioning.
WWW Link.
decoding, encoding, image processing, natural language processing,
element-wise multiplication, image captioning, weighted average,
Testing
BibRef
Yao, T.,
Pan, Y.,
Li, Y.,
Mei, T.,
Hierarchy Parsing for Image Captioning,
ICCV19(2621-2629)
IEEE DOI
2004
convolutional neural nets, feature extraction, image coding,
image representation, image segmentation, Image segmentation
BibRef
Liu, L.,
Tang, J.,
Wan, X.,
Guo, Z.,
Generating Diverse and Descriptive Image Captions Using Visual
Paraphrases,
ICCV19(4239-4248)
IEEE DOI
2004
image classification,
learning (artificial intelligence), Machine learning
BibRef
Ke, L.,
Pei, W.,
Li, R.,
Shen, X.,
Tai, Y.,
Reflective Decoding Network for Image Captioning,
ICCV19(8887-8896)
IEEE DOI
2004
decoding, encoding, feature extraction,
learning (artificial intelligence), Random access memory
BibRef
Vered, G.,
Oren, G.,
Atzmon, Y.,
Chechik, G.,
Joint Optimization for Cooperative Image Captioning,
ICCV19(8897-8906)
IEEE DOI
2004
gradient methods, image sampling, natural language processing,
stochastic programming, text analysis, Loss measurement
BibRef
Ge, H.,
Yan, Z.,
Zhang, K.,
Zhao, M.,
Sun, L.,
Exploring Overall Contextual Information for Image Captioning in
Human-Like Cognitive Style,
ICCV19(1754-1763)
IEEE DOI
2004
cognition, computational linguistics,
learning (artificial intelligence), Cognition
BibRef
Agrawal, H.,
Desai, K.,
Wang, Y.,
Chen, X.,
Jain, R.,
Johnson, M.,
Batra, D.,
Parikh, D.,
Lee, S.,
Anderson, P.,
nocaps: novel object captioning at scale,
ICCV19(8947-8956)
IEEE DOI
2004
feature extraction,
learning (artificial intelligence), object detection, Vegetation
BibRef
Hu, H.,
Misra, I.,
van der Maaten, L.,
Evaluating Text-to-Image Matching using Binary Image Selection
(BISON),
CLVL19(1887-1890)
IEEE DOI
2004
content-based retrieval, image annotation,
image matching, image retrieval, text analysis, linguistic content,
Image Captioning
BibRef
Nguyen, A.,
Tran, Q.D.,
Do, T.,
Reid, I.,
Caldwell, D.G.,
Tsagarakis, N.G.,
Object Captioning and Retrieval with Natural Language,
ACVR19(2584-2592)
IEEE DOI
2004
convolutional neural nets, image retrieval,
learning (artificial intelligence), vision and language
BibRef
Gu, J.,
Joty, S.,
Cai, J.,
Zhao, H.,
Yang, X.,
Wang, G.,
Unpaired Image Captioning via Scene Graph Alignments,
ICCV19(10322-10331)
IEEE DOI
2004
graph theory, image representation, image retrieval,
natural language processing, text analysis, Encoding
BibRef
Shen, T.,
Kar, A.,
Fidler, S.,
Learning to Caption Images Through a Lifetime by Asking Questions,
ICCV19(10392-10401)
IEEE DOI
2004
image retrieval, multi-agent systems,
natural language processing, Automobiles
BibRef
Tanaka, M.,
Itamochi, T.,
Narioka, K.,
Sato, I.,
Ushiku, Y.,
Harada, T.,
Generating Easy-to-Understand Referring Expressions for Target
Identifications,
ICCV19(5793-5802)
IEEE DOI
2004
Code, Annotation.
WWW Link.
computer games, image processing, referred objects,
salient contexts, human annotation, Grand Theft Auto V,
Task analysis
BibRef
Aneja, J.[Jyoti],
Agrawal, H.[Harsh],
Batra, D.[Dhruv],
Schwing, A.G.[Alexander G.],
Sequential Latent Spaces for Modeling the Intention During Diverse
Image Captioning,
ICCV19(4260-4269)
IEEE DOI
2004
image retrieval, image segmentation,
learning (artificial intelligence), recurrent neural nets, Controllability
BibRef
Gupta, T.,
Schwing, A.G.,
Hoiem, D.,
ViCo: Word Embeddings From Visual Co-Occurrences,
ICCV19(7424-7433)
IEEE DOI
2004
feature extraction, image annotation, image classification,
pattern clustering, supervised learning, text analysis,
Vocabulary
BibRef
Deshpande, A.[Aditya],
Aneja, J.[Jyoti],
Wang, L.W.[Li-Wei],
Schwing, A.G.[Alexander G.],
Forsyth, D.A.[David A.],
Fast, Diverse and Accurate Image Captioning Guided by Part-Of-Speech,
CVPR19(10687-10696).
IEEE DOI
2002
BibRef
Wei, H.Y.[Hai-Yang],
Li, Z.X.[Zhi-Xin],
Zhang, C.L.[Can-Long],
Image Captioning Based on Visual and Semantic Attention,
MMMod20(I:151-162).
Springer DOI
2003
BibRef
Dognin, P.[Pierre],
Melnyk, I.[Igor],
Mroueh, Y.[Youssef],
Ross, J.[Jerret],
Sercu, T.[Tom],
Adversarial Semantic Alignment for Improved Image Captions,
CVPR19(10455-10463).
IEEE DOI
2002
BibRef
Fukui, H.[Hiroshi],
Hirakawa, T.[Tsubasa],
Yamashita, T.[Takayoshi],
Fujiyoshi, H.[Hironobu],
Attention Branch Network: Learning of Attention Mechanism for Visual
Explanation,
CVPR19(10697-10706).
IEEE DOI
2002
BibRef
Biten, A.F.[Ali Furkan],
Gomez, L.[Lluis],
Rusinol, M.[Marcal],
Karatzas, D.[Dimosthenis],
Good News, Everyone! Context Driven Entity-Aware Captioning for News
Images,
CVPR19(12458-12467).
IEEE DOI
2002
BibRef
Surís, D.[Dídac],
Epstein, D.[Dave],
Ji, H.[Heng],
Chang, S.F.[Shih-Fu],
Vondrick, C.[Carl],
Learning to Learn Words from Visual Scenes,
ECCV20(XXIX: 434-452).
Springer DOI
2010
BibRef
Bracha, L.[Lior],
Chechik, G.[Gal],
Informative Object Annotations: Tell Me Something I Don't Know,
CVPR19(12499-12507).
IEEE DOI
2002
BibRef
Shuster, K.[Kurt],
Humeau, S.[Samuel],
Hu, H.[Hexiang],
Bordes, A.[Antoine],
Weston, J.[Jason],
Engaging Image Captioning via Personality,
CVPR19(12508-12518).
IEEE DOI
2002
BibRef
Feng, Y.[Yang],
Ma, L.[Lin],
Liu, W.[Wei],
Luo, J.B.[Jie-Bo],
Unsupervised Image Captioning,
CVPR19(4120-4129).
IEEE DOI
2002
BibRef
Xu, Y.[Yan],
Wu, B.Y.[Bao-Yuan],
Shen, F.M.[Fu-Min],
Fan, Y.B.[Yan-Bo],
Zhang, Y.[Yong],
Shen, H.T.[Heng Tao],
Liu, W.[Wei],
Exact Adversarial Attack to Image Captioning via Structured Output
Learning With Latent Variables,
CVPR19(4130-4139).
IEEE DOI
2002
BibRef
Wang, Q.Z.[Qing-Zhong],
Chan, A.B.[Antoni B.],
Describing Like Humans: On Diversity in Image Captioning,
CVPR19(4190-4198).
IEEE DOI
2002
BibRef
Guo, L.T.[Long-Teng],
Liu, J.[Jing],
Yao, P.[Peng],
Li, J.W.[Jiang-Wei],
Lu, H.Q.[Han-Qing],
MSCap: Multi-Style Image Captioning With Unpaired Stylized Text,
CVPR19(4199-4208).
IEEE DOI
2002
BibRef
Zhang, L.[Lu],
Zhang, J.M.[Jian-Ming],
Lin, Z.[Zhe],
Lu, H.C.[Hu-Chuan],
He, Y.[You],
CapSal: Leveraging Captioning to Boost Semantics for Salient Object
Detection,
CVPR19(6017-6026).
IEEE DOI
2002
BibRef
Yin, G.J.[Guo-Jun],
Sheng, L.[Lu],
Liu, B.[Bin],
Yu, N.H.[Neng-Hai],
Wang, X.G.[Xiao-Gang],
Shao, J.[Jing],
Context and Attribute Grounded Dense Captioning,
CVPR19(6234-6243).
IEEE DOI
2002
BibRef
Gao, J.L.[Jun-Long],
Wang, S.Q.[Shi-Qi],
Wang, S.S.[Shan-She],
Ma, S.W.[Si-Wei],
Gao, W.[Wen],
Self-Critical N-Step Training for Image Captioning,
CVPR19(6293-6301).
IEEE DOI
2002
BibRef
Cornia, M.[Marcella],
Baraldi, L.[Lorenzo],
Cucchiara, R.[Rita],
Show, Control and Tell: A Framework for Generating Controllable and
Grounded Captions,
CVPR19(8299-8308).
IEEE DOI
2002
BibRef
Qin, Y.[Yu],
Du, J.J.[Jia-Jun],
Zhang, Y.H.[Yong-Hua],
Lu, H.T.[Hong-Tao],
Look Back and Predict Forward in Image Captioning,
CVPR19(8359-8367).
IEEE DOI
2002
BibRef
Zheng, Y.[Yue],
Li, Y.[Yali],
Wang, S.J.[Sheng-Jin],
Intention Oriented Image Captions With Guiding Objects,
CVPR19(8387-8396).
IEEE DOI
2002
BibRef
Huang, Y.,
Li, C.,
Li, T.,
Wan, W.,
Chen, J.,
Image Captioning with Attribute Refinement,
ICIP19(1820-1824)
IEEE DOI
1910
Image captioning, attribute recognition, Semantic attention,
Deep Neural Network, Conditional Random Field
BibRef
Lee, J.,
Lee, Y.,
Seong, S.,
Kim, K.,
Kim, S.,
Kim, J.,
Capturing Long-Range Dependencies in Video Captioning,
ICIP19(1880-1884)
IEEE DOI
1910
Video captioning, non-local block, long short-term memory,
long-range dependency, video representation
BibRef
Shi, J.,
Li, Y.,
Wang, S.,
Cascade Attention: Multiple Feature Based Learning for Image
Captioning,
ICIP19(1970-1974)
IEEE DOI
1910
Image Captioning, Attention Mechanism, Cascade Attention
BibRef
Wang, Y.,
Shen, Y.,
Xiong, H.,
Lin, W.,
Adaptive Hard Example Mining for Image Captioning,
ICIP19(3342-3346)
IEEE DOI
1910
Reinforcement Learning, Image Captioning
BibRef
Xiao, H.,
Shi, J.,
A Novel Attribute Selection Mechanism for Video Captioning,
ICIP19(619-623)
IEEE DOI
1910
Attributes, Video captioning, Attention, Reinforcement learning
BibRef
Lim, J.H.,
Chan, C.S.,
Mask Captioning Network,
ICIP19(1-5)
IEEE DOI
1910
Image captioning, Deep learning, Scene understanding
BibRef
Wang, Q.Z.[Qing-Zhong],
Chan, A.B.[Antoni B.],
Gated Hierarchical Attention for Image Captioning,
ACCV18(IV:21-37).
Springer DOI
1906
BibRef
Wang, W.[Weixuan],
Chen, Z.H.[Zhi-Hong],
Hu, H.F.[Hai-Feng],
Multivariate Attention Network for Image Captioning,
ACCV18(VI:587-602).
Springer DOI
1906
BibRef
Ghanimifard, M.[Mehdi],
Dobnik, S.[Simon],
Knowing When to Look for What and Where: Evaluating Generation of
Spatial Descriptions with Adaptive Attention,
VL18(IV:153-161).
Springer DOI
1905
See also Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning.
BibRef
Kim, B.[Boeun],
Lee, Y.H.[Young Han],
Jung, H.[Hyedong],
Cho, C.[Choongsang],
Distinctive-Attribute Extraction for Image Captioning,
VL18(IV:133-144).
Springer DOI
1905
BibRef
Tanti, M.[Marc],
Gatt, A.[Albert],
Muscat, A.[Adrian],
Pre-gen Metrics: Predicting Caption Quality Metrics Without Generating
Captions,
VL18(IV:114-123).
Springer DOI
1905
BibRef
Tanti, M.[Marc],
Gatt, A.[Albert],
Camilleri, K.P.[Kenneth P.],
Quantifying the Amount of Visual Information Used by Neural Caption
Generators,
VL18(IV:124-132).
Springer DOI
1905
BibRef
Ren, L.,
Qi, G.,
Hua, K.,
Improving Diversity of Image Captioning Through Variational
Autoencoders and Adversarial Learning,
WACV19(263-272)
IEEE DOI
1904
image classification, image coding,
image segmentation, learning (artificial intelligence),
Maximum likelihood estimation
BibRef
Zhou, Y.,
Sun, Y.,
Honavar, V.,
Improving Image Captioning by Leveraging Knowledge Graphs,
WACV19(283-293)
IEEE DOI
1904
graph theory, image capture, image retrieval,
performance measure, image captioning systems, knowledge graphs,
Generators
BibRef
Rapson, C.J.,
Seet, B.,
Naeem, M.A.,
Lee, J.E.,
Al-Sarayreh, M.,
Klette, R.,
Reducing the Pain: A Novel Tool for Efficient Ground-Truth Labelling
in Images,
IVCNZ18(1-9)
IEEE DOI
1902
Labeling, Tools, Image segmentation, Image color analysis, Brushes,
Head, Automobiles, image labelling, annotations, segmentation,
image dataset
BibRef
Lu, J.S.[Jia-Sen],
Yang, J.W.[Jian-Wei],
Batra, D.[Dhruv],
Parikh, D.[Devi],
Neural Baby Talk,
CVPR18(7219-7228)
IEEE DOI
1812
Detectors, Visualization, Grounding, Pediatrics, Natural languages,
Dogs, Task analysis
BibRef
Wu, B.Y.[Bao-Yuan],
Chen, W.D.[Wei-Dong],
Sun, P.[Peng],
Liu, W.[Wei],
Ghanem, B.[Bernard],
Lyu, S.W.[Si-Wei],
Tagging Like Humans: Diverse and Distinct Image Annotation,
CVPR18(7967-7975)
IEEE DOI
1812
Semantics, Image annotation, Redundancy, Training,
Task analysis, Generators
BibRef
Wu, X.J.[Xin-Jian],
Zhang, L.[Li],
Li, F.Z.[Fan-Zhang],
Wang, B.J.[Bang-Jun],
A Novel Model for Multi-label Image Annotation,
ICPR18(1953-1958)
IEEE DOI
1812
Feature extraction, Image annotation, Computational modeling,
Semantics, Measurement, Visualization, Classification algorithms,
Multi-label learning
BibRef
Jiu, M.,
Sahbi, H.,
Qi, L.,
Deep Context Networks for Image Annotation,
ICPR18(2422-2427)
IEEE DOI
1812
image annotation, image classification,
learning (artificial intelligence), deep context networks,
Standards
BibRef
Khademi, M.,
Schulte, O.,
Image Caption Generation with Hierarchical Contextual Visual Spatial
Attention,
Cognitive18(2024-20248)
IEEE DOI
1812
Feature extraction, Visualization, Logic gates,
Computer architecture, Task analysis, Context modeling, Computational modeling
BibRef
Yan, S.,
Wu, F.,
Smith, J.S.,
Lu, W.,
Zhang, B.,
Image Captioning using Adversarial Networks and Reinforcement
Learning,
ICPR18(248-253)
IEEE DOI
1812
Generators, Generative adversarial networks,
Monte Carlo methods, Maximum likelihood estimation,
Task analysis
BibRef
Wang, F.,
Gong, X.,
Huang, L.,
Time-Dependent Pre-attention Model for Image Captioning,
ICPR18(3297-3302)
IEEE DOI
1812
Decoding, Task analysis, Semantics, Visualization,
Feature extraction, Computational modeling, Computer science
BibRef
Luo, R.,
Shakhnarovich, G.,
Cohen, S.,
Price, B.,
Discriminability Objective for Training Descriptive Captions,
CVPR18(6964-6974)
IEEE DOI
1812
Training, Task analysis, Visualization, Measurement,
Computational modeling, Generators, Airplanes
BibRef
Cui, Y.,
Yang, G.,
Veit, A.,
Huang, X.,
Belongie, S.,
Learning to Evaluate Image Captioning,
CVPR18(5804-5812)
IEEE DOI
1812
Measurement, Pathology, Training, Correlation, SPICE, Robustness, Task analysis
BibRef
Aneja, J.,
Deshpande, A.,
Schwing, A.G.,
Convolutional Image Captioning,
CVPR18(5561-5570)
IEEE DOI
1812
Training, Computer architecture, Task analysis,
Hidden Markov models, Microprocessors, Computational modeling, Indexing
BibRef
Chen, F.,
Ji, R.,
Sun, X.,
Wu, Y.,
Su, J.,
GroupCap: Group-Based Image Captioning with Structured Relevance and
Diversity Constraints,
CVPR18(1345-1353)
IEEE DOI
1812
Visualization, Correlation, Semantics, Feature extraction, Training,
Adaptation models, Task analysis
BibRef
Chen, X.,
Ma, L.,
Jiang, W.,
Yao, J.,
Liu, W.,
Regularizing RNNs for Caption Generation by Reconstructing the Past
with the Present,
CVPR18(7995-8003)
IEEE DOI
1812
Pattern recognition
BibRef
Yao, T.[Ting],
Pan, Y.W.[Ying-Wei],
Li, Y.[Yehao],
Mei, T.[Tao],
Exploring Visual Relationship for Image Captioning,
ECCV18(XIV: 711-727).
Springer DOI
1810
BibRef
Shah, S.A.A.[Syed Afaq Ali],
NNEval: Neural Network Based Evaluation Metric for Image Captioning,
ECCV18(VIII: 39-55).
Springer DOI
1810
BibRef
Jiang, W.H.[Wen-Hao],
Ma, L.[Lin],
Jiang, Y.G.[Yu-Gang],
Liu, W.[Wei],
Zhang, T.[Tong],
Recurrent Fusion Network for Image Captioning,
ECCV18(II: 510-526).
Springer DOI
1810
BibRef
Chatterjee, M.[Moitreya],
Schwing, A.G.[Alexander G.],
Diverse and Coherent Paragraph Generation from Images,
ECCV18(II: 747-763).
Springer DOI
1810
BibRef
Chen, S.[Shi],
Zhao, Q.[Qi],
Boosted Attention: Leveraging Human Attention for Image Captioning,
ECCV18(XI: 72-88).
Springer DOI
1810
BibRef
Dai, B.[Bo],
Ye, D.[Deming],
Lin, D.[Dahua],
Rethinking the Form of Latent States in Image Captioning,
ECCV18(VI: 294-310).
Springer DOI
1810
BibRef
Liu, X.H.[Xi-Hui],
Li, H.S.[Hong-Sheng],
Shao, J.[Jing],
Chen, D.P.[Da-Peng],
Wang, X.G.[Xiao-Gang],
Show, Tell and Discriminate:
Image Captioning by Self-retrieval with Partially Labeled Data,
ECCV18(XV: 353-369).
Springer DOI
1810
BibRef
Fang, F.,
Wang, H.,
Tang, P.,
Image Captioning with Word Level Attention,
ICIP18(1278-1282)
IEEE DOI
1809
Visualization, Feature extraction, Task analysis, Training,
Recurrent neural networks, Semantics, Computational modeling,
bidirectional spatial embedding
BibRef
Zhu, Z.,
Xue, Z.,
Yuan, Z.,
Topic-Guided Attention for Image Captioning,
ICIP18(2615-2619)
IEEE DOI
1809
Visualization, Semantics, Feature extraction, Training, Decoding,
Generators, Measurement, Image captioning, Attention, Topic, Attribute,
Deep Neural Network
BibRef
Gomez-Garay, A.[Alejandro],
Raducanu, B.[Bogdan],
Salas, J.[Joaquín],
Dense Captioning of Natural Scenes in Spanish,
MCPR18(145-154).
Springer DOI
1807
BibRef
Yao, L.[Li],
Ballas, N.[Nicolas],
Cho, K.[Kyunghyun],
Smith, J.[John],
Bengio, Y.[Yoshua],
Oracle Performance for Visual Captioning,
BMVC16(xx-yy).
HTML Version.
1805
BibRef
Khatchatoorian, A.G.,
Jamzad, M.,
Post Rectifying Methods to Improve the Accuracy of Image Annotation,
DICTA17(1-7)
IEEE DOI
1804
feature extraction, image annotation, image classification,
image retrieval, matrix algebra, Class-tag relation matrix,
Time division multiplexing
BibRef
Dong, H.[Hao],
Zhang, J.Q.[Jing-Qing],
McIlwraith, D.[Douglas],
Guo, Y.[Yike],
I2T2I: Learning text to image synthesis with textual data
augmentation,
ICIP17(2015-2019)
IEEE DOI
1803
Birds, Generators, Image generation,
Recurrent neural networks, Shape, Training, Deep learning, GAN, Image Synthesis
BibRef
Pellegrin, L.[Luis],
Escalante, H.J.[Hugo Jair],
Montes-y-Gómez, M.[Manuel],
Villegas, M.[Mauricio],
González, F.A.[Fabio A.],
A Flexible Framework for the Evaluation of Unsupervised Image
Annotation,
CIARP17(508-516).
Springer DOI
1802
BibRef
Jia, Y.H.[Yu-Hua],
Bai, L.[Liang],
Wang, P.[Peng],
Guo, J.L.[Jin-Lin],
Xie, Y.X.[Yu-Xiang],
Deep Convolutional Neural Network for Correlating Images and Sentences,
MMMod18(I:154-165).
Springer DOI
1802
BibRef
Liu, J.Y.[Jing-Yu],
Wang, L.[Liang],
Yang, M.H.[Ming-Hsuan],
Referring Expression Generation and Comprehension via Attributes,
ICCV17(4866-4874)
IEEE DOI
1802
Language Descriptions for objects.
learning (artificial intelligence), object detection, RefCOCO,
RefCOCO+, RefCOCOg, attribute learning model, common space model,
Visualization
BibRef
Dai, B.,
Fidler, S.,
Urtasun, R.,
Lin, D.,
Towards Diverse and Natural Image Descriptions via a Conditional GAN,
ICCV17(2989-2998)
IEEE DOI
1802
image retrieval, image sequences, inference mechanisms,
learning (artificial intelligence),
Visualization
BibRef
Liang, X.,
Hu, Z.,
Zhang, H.,
Gan, C.,
Xing, E.P.,
Recurrent Topic-Transition GAN for Visual Paragraph Generation,
ICCV17(3382-3391)
IEEE DOI
1802
document image processing, inference mechanisms, natural scenes,
recurrent neural nets, text analysis, RTT-GAN,
Visualization
BibRef
Shetty, R.,
Rohrbach, M.,
Hendricks, L.A.,
Fritz, M.,
Schiele, B.,
Speaking the Same Language:
Matching Machine to Human Captions by Adversarial Training,
ICCV17(4155-4164)
IEEE DOI
1802
image matching, learning (artificial intelligence),
sampling methods, vocabulary, adversarial training,
Visualization
BibRef
Liu, S.,
Zhu, Z.,
Ye, N.,
Guadarrama, S.,
Murphy, K.,
Improved Image Captioning via Policy Gradient optimization of SPIDEr,
ICCV17(873-881)
IEEE DOI
1802
Maximum likelihood estimation, Measurement, Mixers, Robustness,
SPICE, Training
BibRef
Gu, J.X.[Jiu-Xiang],
Joty, S.[Shafiq],
Cai, J.F.[Jian-Fei],
Wang, G.[Gang],
Unpaired Image Captioning by Language Pivoting,
ECCV18(I: 519-535).
Springer DOI
1810
BibRef
Gu, J.X.[Jiu-Xiang],
Wang, G.[Gang],
Cai, J.F.[Jian-Fei],
Chen, T.H.[Tsu-Han],
An Empirical Study of Language CNN for Image Captioning,
ICCV17(1231-1240)
IEEE DOI
1802
convolution, learning (artificial intelligence),
natural language processing, recurrent neural nets,
Recurrent neural networks
BibRef
Pedersoli, M.,
Lucas, T.,
Schmid, C.,
Verbeek, J.,
Areas of Attention for Image Captioning,
ICCV17(1251-1259)
IEEE DOI
1802
image segmentation, inference mechanisms,
natural language processing, object detection,
Visualization
BibRef
Zhang, Z.,
Wu, J.J.,
Li, Q.,
Huang, Z.,
Traer, J.,
McDermott, J.H.,
Tenenbaum, J.B.,
Freeman, W.T.,
Generative Modeling of Audible Shapes for Object Perception,
ICCV17(1260-1269)
IEEE DOI
1802
audio recording, audio signal processing, audio-visual systems,
feature extraction, inference mechanisms, interactive systems,
Visualization
BibRef
Liu, Z.J.[Zhi-Jian],
Freeman, W.T.[William T.],
Tenenbaum, J.B.[Joshua B.],
Wu, J.J.[Jia-Jun],
Physical Primitive Decomposition,
ECCV18(XII: 3-20).
Springer DOI
1810
BibRef
Wu, J.J.[Jia-Jun],
Lim, J.[Joseph],
Zhang, H.Y.[Hong-Yi],
Tenenbaum, J.B.[Joshua B.],
Freeman, W.T.[William T.],
Physics 101: Learning Physical Object Properties from Unlabeled Videos,
BMVC16(xx-yy).
HTML Version.
1805
BibRef
Tavakoli, H.R.,
Shetty, R.,
Borji, A.,
Laaksonen, J.,
Paying Attention to Descriptions Generated by Image Captioning Models,
ICCV17(2506-2515)
IEEE DOI
1802
feature extraction, image processing, human descriptions,
human-written descriptions, image captioning model,
Visualization
BibRef
Tripathi, A.[Anurag],
Gupta, A.[Abhinav],
Chaudhary, S.[Santanu],
Lall, B.[Brejesh],
Image Annotation Using Latent Components and Transmedia Association,
PReMI17(493-500).
Springer DOI
1711
BibRef
Wu, B.Y.[Bao-Yuan],
Jia, F.[Fan],
Liu, W.[Wei],
Ghanem, B.[Bernard],
Diverse Image Annotation,
CVPR17(6194-6202)
IEEE DOI
1711
Correlation, Feature extraction, Measurement, Redundancy, Semantics
BibRef
Krause, J.[Jonathan],
Johnson, J.[Justin],
Krishna, R.[Ranjay],
Fei-Fei, L.[Li],
A Hierarchical Approach for Generating Descriptive Image Paragraphs,
CVPR17(3337-3345)
IEEE DOI
1711
Feature extraction, Natural languages, Pragmatics,
Recurrent neural networks, Speech, Visualization
BibRef
Vedantam, R.,
Bengio, S.,
Murphy, K.,
Parikh, D.,
Chechik, G.,
Context-Aware Captions from Context-Agnostic Supervision,
CVPR17(1070-1079)
IEEE DOI
1711
Birds, Cats, Cognition, Context modeling, Pragmatics, Training
BibRef
Gan, Z.,
Gan, C.,
He, X.,
Pu, Y.,
Tran, K.,
Gao, J.,
Carin, L.,
Deng, L.,
Semantic Compositional Networks for Visual Captioning,
CVPR17(1141-1150)
IEEE DOI
1711
Feature extraction, Mouth, Pediatrics, Semantics, Tensile stress,
Training, Visualization
BibRef
Ren, Z.,
Wang, X.,
Zhang, N.,
Lv, X.,
Li, L.J.,
Deep Reinforcement Learning-Based Image Captioning with Embedding
Reward,
CVPR17(1151-1159)
IEEE DOI
1711
Decision making, Learning (artificial intelligence), Measurement,
Neural networks, Training, Visualization
BibRef
Rennie, S.J.,
Marcheret, E.,
Mroueh, Y.,
Ross, J.,
Goel, V.,
Self-Critical Sequence Training for Image Captioning,
CVPR17(1179-1195)
IEEE DOI
1711
Inference algorithms, Learning (artificial intelligence),
Logic gates, Measurement, Predictive models, Training
BibRef
Yang, L.,
Tang, K.,
Yang, J.,
Li, L.J.,
Dense Captioning with Joint Inference and Visual Context,
CVPR17(1978-1987)
IEEE DOI
1711
Bioinformatics, Genomics, Object detection, Proposals, Semantics,
Training, Visualization
BibRef
Lu, J.,
Xiong, C.,
Parikh, D.,
Socher, R.,
Knowing When to Look: Adaptive Attention via a Visual Sentinel for
Image Captioning,
CVPR17(3242-3250)
IEEE DOI
1711
Adaptation models, Computational modeling, Context modeling,
Decoding, Logic gates, Mathematical model, Visualization
BibRef
Yao, T.,
Pan, Y.,
Li, Y.,
Mei, T.,
Incorporating Copying Mechanism in Image Captioning for Learning
Novel Objects,
CVPR17(5263-5271)
IEEE DOI
1711
Decoding, Hidden Markov models, Object recognition,
Recurrent neural networks, Standards, Training, Visualization
BibRef
Chen, L.,
Zhang, H.,
Xiao, J.,
Nie, L.,
Shao, J.,
Liu, W.,
Chua, T.S.,
SCA-CNN: Spatial and Channel-Wise Attention in Convolutional Networks
for Image Captioning,
CVPR17(6298-6306)
IEEE DOI
1711
Detectors, Feature extraction, Image coding, Neural networks,
Semantics, Visualization
BibRef
Sun, Q.,
Lee, S.,
Batra, D.,
Bidirectional Beam Search: Forward-Backward Inference in Neural
Sequence Models for Fill-in-the-Blank Image Captioning,
CVPR17(7215-7223)
IEEE DOI
1711
Approximation algorithms, Computational modeling, Decoding,
History, Inference algorithms, Recurrent neural networks
BibRef
Wang, Y.,
Lin, Z.,
Shen, X.,
Cohen, S.,
Cottrell, G.W.,
Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition,
CVPR17(7378-7387)
IEEE DOI
1711
Measurement, Recurrent neural networks, SPICE, Semantics, Skeleton, Training
BibRef
Zanfir, M.[Mihai],
Marinoiu, E.[Elisabeta],
Sminchisescu, C.[Cristian],
Spatio-Temporal Attention Models for Grounded Video Captioning,
ACCV16(IV: 104-119).
Springer DOI
1704
BibRef
Chen, T.H.[Tseng-Hung],
Zeng, K.H.[Kuo-Hao],
Hsu, W.T.[Wan-Ting],
Sun, M.[Min],
Video Captioning via Sentence Augmentation and Spatio-Temporal
Attention,
Assist16(I: 269-286).
Springer DOI
1704
BibRef
Tan, Y.H.[Ying Hua],
Chan, C.S.[Chee Seng],
phi-LSTM: A Phrase-Based Hierarchical LSTM Model for Image Captioning,
ACCV16(V: 101-117).
Springer DOI
1704
BibRef
Weiland, L.[Lydia],
Hulpus, I.[Ioana],
Ponzetto, S.P.[Simone Paolo],
Dietz, L.[Laura],
Using Object Detection, NLP, and Knowledge Bases to Understand the
Message of Images,
MMMod17(II: 405-418).
Springer DOI
1701
BibRef
Liu, Y.[Yu],
Guo, Y.M.[Yan-Ming],
Lew, M.S.[Michael S.],
What Convnets Make for Image Captioning?,
MMMod17(I: 416-428).
Springer DOI
1701
BibRef
Tran, K.,
He, X.,
Zhang, L.,
Sun, J.,
Rich Image Captioning in the Wild,
DeepLearn-C16(434-441)
IEEE DOI
1612
BibRef
Wang, Y.L.[Yi-Lin],
Wang, S.H.[Su-Hang],
Tang, J.L.[Ji-Liang],
Liu, H.[Huan],
Li, B.X.[Bao-Xin],
PPP: Joint Pointwise and Pairwise Image Label Prediction,
CVPR16(6005-6013)
IEEE DOI
1612
BibRef
Sadhu, A.[Arka],
Gupta, T.[Tanmay],
Yatskar, M.[Mark],
Nevatia, R.[Ram],
Kembhavi, A.[Aniruddha],
Visual Semantic Role Labeling for Video Understanding,
CVPR21(5585-5596)
IEEE DOI
2111
Visualization, Annotations, Semantics,
Benchmark testing, Motion pictures, Pattern recognition
BibRef
Yatskar, M.[Mark],
Ordonez, V.,
Zettlemoyer, L.[Luke],
Farhadi, A.[Ali],
Commonly Uncommon: Semantic Sparsity in Situation Recognition,
CVPR17(6335-6344)
IEEE DOI
1711
BibRef
Earlier: A1, A3, A4, Only:
Situation Recognition: Visual Semantic Role Labeling for Image
Understanding,
CVPR16(5534-5542)
IEEE DOI
1612
Image recognition, Image representation, Predictive models,
Semantics, Tensile stress, Training
BibRef
Kottur, S.[Satwik],
Vedantam, R.[Ramakrishna],
Moura, J.M.F.[José M. F.],
Parikh, D.[Devi],
VisualWord2Vec (Vis-W2V):
Learning Visually Grounded Word Embeddings Using Abstract Scenes,
CVPR16(4985-4994)
IEEE DOI
1612
BibRef
Zhu, Y.,
Groth, O.,
Bernstein, M.,
Fei-Fei, L.,
Visual7W: Grounded Question Answering in Images,
CVPR16(4995-5004)
IEEE DOI
1612
BibRef
Zhang, P.,
Goyal, Y.,
Summers-Stay, D.,
Batra, D.,
Parikh, D.,
Yin and Yang: Balancing and Answering Binary Visual Questions,
CVPR16(5014-5022)
IEEE DOI
1612
BibRef
Park, D.H.,
Darrell, T.J.,
Rohrbach, A.,
Robust Change Captioning,
ICCV19(4623-4632)
IEEE DOI
2004
feature extraction, learning (artificial intelligence),
natural language processing, object-oriented programming, Predictive models
BibRef
Venugopalan, S.[Subhashini],
Hendricks, L.A.[Lisa Anne],
Rohrbach, M.[Marcus],
Mooney, R.[Raymond],
Darrell, T.J.[Trevor J.],
Saenko, K.[Kate],
Captioning Images with Diverse Objects,
CVPR17(1170-1178)
IEEE DOI
1711
BibRef
Earlier: A2, A1, A3, A4, A6, A5:
Deep Compositional Captioning: Describing Novel Object Categories
without Paired Training Data,
CVPR16(1-10)
IEEE DOI
1612
Data models, Image recognition, Predictive models, Semantics,
Training, Visualization.
Novel objects not in training data.
BibRef
Johnson, J.[Justin],
Karpathy, A.[Andrej],
Fei-Fei, L.[Li],
DenseCap:
Fully Convolutional Localization Networks for Dense Captioning,
CVPR16(4565-4574)
IEEE DOI
1612
Both localize and describe salient regions in images in natural language.
BibRef
Wang, M.[Minsi],
Song, L.[Li],
Yang, X.K.[Xiao-Kang],
Luo, C.F.[Chuan-Fei],
A parallel-fusion RNN-LSTM architecture for image caption generation,
ICIP16(4448-4452)
IEEE DOI
1610
Computational modeling
deep convolutional networks and recurrent neural networks.
BibRef
Lin, X.[Xiao],
Parikh, D.[Devi],
Leveraging Visual Question Answering for Image-Caption Ranking,
ECCV16(II: 261-277).
Springer DOI
1611
BibRef
Earlier:
Don't just listen, use your imagination:
Leveraging visual common sense for non-visual tasks,
CVPR15(2984-2993)
IEEE DOI
1510
BibRef
Chen, T.L.[Tian-Lang],
Zhang, Z.P.[Zhong-Ping],
You, Q.Z.[Quan-Zeng],
Fang, C.[Chen],
Wang, Z.W.[Zhao-Wen],
Jin, H.L.[Hai-Lin],
Luo, J.B.[Jie-Bo],
'Factual' or 'Emotional':
Stylized Image Captioning with Adaptive Learning and Attention,
ECCV18(X: 527-543).
Springer DOI
1810
BibRef
You, Q.Z.[Quan-Zeng],
Jin, H.L.[Hai-Lin],
Wang, Z.W.[Zhao-Wen],
Fang, C.[Chen],
Luo, J.B.[Jie-Bo],
Image Captioning with Semantic Attention,
CVPR16(4651-4659)
IEEE DOI
1612
BibRef
Jia, X.[Xu],
Gavves, E.[Efstratios],
Fernando, B.[Basura],
Tuytelaars, T.[Tinne],
Guiding the Long-Short Term Memory Model for Image Caption Generation,
ICCV15(2407-2415)
IEEE DOI
1602
Computer architecture
BibRef
Chen, X.L.[Xin-Lei],
Zitnick, C.L.[C. Lawrence],
Mind's eye:
A recurrent visual representation for image caption generation,
CVPR15(2422-2431)
IEEE DOI
1510
BibRef
Vedantam, R.[Ramakrishna],
Zitnick, C.L.[C. Lawrence],
Parikh, D.[Devi],
CIDEr: Consensus-based image description evaluation,
CVPR15(4566-4575)
IEEE DOI
1510
BibRef
Fang, H.[Hao],
Gupta, S.[Saurabh],
Iandola, F.[Forrest],
Srivastava, R.K.[Rupesh K.],
Deng, L.[Li],
Dollar, P.[Piotr],
Gao, J.F.[Jian-Feng],
He, X.D.[Xiao-Dong],
Mitchell, M.[Margaret],
Platt, J.C.[John C.],
Zitnick, C.L.[C. Lawrence],
Zweig, G.[Geoffrey],
From captions to visual concepts and back,
CVPR15(1473-1482)
IEEE DOI
1510
BibRef
Ramnath, K.[Krishnan],
Baker, S.[Simon],
Vanderwende, L.[Lucy],
El-Saban, M.[Motaz],
Sinha, S.N.[Sudipta N.],
Kannan, A.[Anitha],
Hassan, N.[Noran],
Galley, M.[Michel],
Yang, Y.[Yi],
Ramanan, D.[Deva],
Bergamo, A.[Alessandro],
Torresani, L.[Lorenzo],
AutoCaption: Automatic caption generation for personal photos,
WACV14(1050-1057)
IEEE DOI
1406
Clouds
BibRef