Peng, Y.X.[Yu-Xin],
Qi, J.W.[Jin-Wei],
Show and Tell in the Loop: Cross-Modal Circular Correlation Learning,
MultMed(21), No. 6, June 2019, pp. 1538-1550.
IEEE DOI
1906
Correlation, Bridges, Logic gates, Semantics, Task analysis, Cognition,
Feeds, Circular correlation learning, cross-modal retrieval,
text-to-image synthesis
BibRef
Zhang, X.W.[Xin-Wei],
Wang, J.[Jin],
Lu, G.D.[Guo-Dong],
Zhang, X.S.[Xu-Sheng],
Pattern understanding and synthesis based on layout tree descriptor,
VC(36), No. 6, June 2020, pp. 1141-1155.
WWW Link.
2005
BibRef
Baraheem, S.S.[Samah S.],
Nguyen, T.V.[Tam V.],
Text-to-image via mask anchor points,
PRL(133), 2020, pp. 25-32.
Elsevier DOI
2005
Text-to-image, Mask dataset, Image synthesis, Anchor points
BibRef
Chen, Q.[Qi],
Wu, Q.[Qi],
Chen, J.[Jian],
Wu, Q.Y.[Qing-Yao],
van den Hengel, A.J.[Anton J.],
Tan, M.K.[Ming-Kui],
Scripted Video Generation With a Bottom-Up Generative Adversarial
Network,
IP(29), 2020, pp. 7454-7467.
IEEE DOI
2007
Generative adversarial networks, video generation,
semantic alignment, temporal coherence
BibRef
Yang, M.[Min],
Liu, J.H.[Jun-Hao],
Shen, Y.[Ying],
Zhao, Z.[Zhou],
Chen, X.J.[Xiao-Jun],
Wu, Q.Y.[Qing-Yao],
Li, C.M.[Cheng-Ming],
An Ensemble of Generation- and Retrieval-Based Image Captioning With
Dual Generator Generative Adversarial Network,
IP(29), 2020, pp. 9627-9640.
IEEE DOI
2011
Generators, Decoding, Generative adversarial networks, Training,
Computational modeling, Task analysis, Image captioning,
adversarial learning
BibRef
Yuan, M.,
Peng, Y.,
CKD: Cross-Task Knowledge Distillation for Text-to-Image Synthesis,
MultMed(22), No. 8, August 2020, pp. 1955-1968.
IEEE DOI
2007
Semantics, Visualization, Task analysis, Image synthesis,
Generative adversarial networks, Neural networks,
image semantic understanding
BibRef
Osahor, U.,
Kazemi, H.,
Dabouei, A.,
Nasrabadi, N.,
Quality Guided Sketch-to-Photo Image Synthesis,
Biometrics20(3575-3584)
IEEE DOI
2008
Pattern recognition
BibRef
Zhao, B.[Bo],
Yin, W.D.[Wei-Dong],
Meng, L.L.[Li-Li],
Sigal, L.[Leonid],
Layout2image: Image Generation from Layout,
IJCV(128), No. 10-11, November 2020, pp. 2418-2435.
Springer DOI
2009
BibRef
Earlier: A1, A3, A2, A4:
Image Generation From Layout,
CVPR19(8576-8585).
IEEE DOI
2002
BibRef
Sheng, L.[Lu],
Pan, J.T.[Jun-Ting],
Guo, J.M.[Jia-Ming],
Shao, J.[Jing],
Loy, C.C.[Chen Change],
High-Quality Video Generation from Static Structural Annotations,
IJCV(128), No. 10-11, November 2020, pp. 2552-2569.
Springer DOI
2009
BibRef
Li, K.[Ke],
Peng, S.C.[Shi-Chong],
Zhang, T.H.[Tian-Hao],
Malik, J.[Jitendra],
Multimodal Image Synthesis with Conditional Implicit Maximum Likelihood
Estimation,
IJCV(128), No. 10-11, November 2020, pp. 2607-2628.
Springer DOI
2009
BibRef
Earlier: A1, A3, A4, Only:
Diverse Image Synthesis From Semantic Layouts via Conditional IMLE,
ICCV19(4219-4228)
IEEE DOI
2004
image representation, image segmentation,
learning (artificial intelligence),
Probabilistic logic
BibRef
Arora, H.[Himanshu],
Mishra, S.[Saurabh],
Peng, S.C.[Shi-Chong],
Li, K.[Ke],
Mahdavi-Amiri, A.[Ali],
Multimodal Shape Completion via Implicit Maximum Likelihood
Estimation,
DLGC22(2957-2966)
IEEE DOI
2210
Point cloud compression, Maximum likelihood estimation, Shape,
Conferences
BibRef
Gao, L.L.[Lian-Li],
Chen, D.Y.[Dai-Yuan],
Zhao, Z.[Zhou],
Shao, J.[Jie],
Shen, H.T.[Heng Tao],
Lightweight dynamic conditional GAN with pyramid attention for
text-to-image synthesis,
PR(110), 2021, pp. 107384.
Elsevier DOI
2011
Text-to-image synthesis,
Conditional generative adversarial network (CGAN),
Pyramid attentive fusion
BibRef
Dong, Y.L.[Yan-Long],
Zhang, Y.[Ying],
Ma, L.[Lin],
Wang, Z.[Zhi],
Luo, J.B.[Jie-Bo],
Unsupervised text-to-image synthesis,
PR(110), 2021, pp. 107573.
Elsevier DOI
2011
Text-to-image synthesis, Generative adversarial network (GAN),
Unsupervised training
BibRef
Yuan, M.,
Peng, Y.,
Bridge-GAN: Interpretable Representation Learning for Text-to-Image
Synthesis,
CirSysVideo(30), No. 11, November 2020, pp. 4258-4268.
IEEE DOI
2011
Visualization, Mutual information, Image synthesis, Task analysis,
Training, Bridge circuits, Semantics, Text-to-image synthesis,
Bridge-GAN
BibRef
Li, R.F.[Rui-Fan],
Wang, N.[Ning],
Feng, F.X.[Fang-Xiang],
Zhang, G.W.[Guang-Wei],
Wang, X.J.[Xiao-Jie],
Exploring Global and Local Linguistic Representations for
Text-to-Image Synthesis,
MultMed(22), No. 12, December 2020, pp. 3075-3087.
IEEE DOI
2011
Task analysis, Linguistics, Generators,
Generative adversarial networks, Training, Correlation,
cross-modal
BibRef
Li, C.Y.[Chun-Ye],
Kong, L.Y.[Li-Ya],
Zhou, Z.P.[Zhi-Ping],
Improved-StoryGAN for sequential images visualization,
JVCIR(73), 2020, pp. 102956.
Elsevier DOI
2012
Story visualization, Weighted Activation Degree (WAD),
Dilated Convolution, Gated Convolution
BibRef
Tan, H.,
Liu, X.,
Liu, M.,
Yin, B.,
Li, X.,
KT-GAN: Knowledge-Transfer Generative Adversarial Network for
Text-to-Image Synthesis,
IP(30), 2021, pp. 1275-1290.
IEEE DOI
2012
Task analysis, Semantics, Generators,
Generative adversarial networks, Knowledge engineering,
alternate attention-transfer mechanism
BibRef
Wang, M.[Min],
Lang, C.Y.[Cong-Yan],
Feng, S.H.[Song-He],
Wang, T.[Tao],
Jin, Y.[Yi],
Li, Y.D.[Yi-Dong],
Text to photo-realistic image synthesis via chained deep recurrent
generative adversarial network,
JVCIR(74), 2021, pp. 102955.
Elsevier DOI
2101
Text-to-image synthesis, Logic relationships,
Computational bottlenecks, Parameters sharing
BibRef
Yang, Y.,
Wang, L.,
Xie, D.,
Deng, C.,
Tao, D.,
Multi-Sentence Auxiliary Adversarial Networks for Fine-Grained
Text-to-Image Synthesis,
IP(30), 2021, pp. 2798-2809.
IEEE DOI
2102
Semantics, Task analysis, Visualization, Training,
Generative adversarial networks, Correlation, Birds,
negative sample learning
BibRef
Elu, A.[Aitzol],
Azkune, G.[Gorka],
de Lacalle, O.L.[Oier Lopez],
Arganda-Carreras, I.[Ignacio],
Soroa, A.[Aitor],
Agirre, E.[Eneko],
Inferring spatial relations from textual descriptions of images,
PR(113), 2021, pp. 107847.
Elsevier DOI
2103
Text-to-image synthesis, Natural language understanding,
Spatial relations, Deep learning
BibRef
Hu, T.[Tao],
Long, C.J.[Cheng-Jiang],
Xiao, C.X.[Chun-Xia],
A Novel Visual Representation on Text Using Diverse Conditional GAN
for Visual Recognition,
IP(30), 2021, pp. 3499-3512.
IEEE DOI
2103
Use text from social media to train image recognition.
Visualization, Feature extraction, Image recognition,
Text recognition, Generators,
visual recognition
BibRef
Yang, C.Y.[Ce-Yuan],
Shen, Y.J.[Yu-Jun],
Zhou, B.L.[Bo-Lei],
Semantic Hierarchy Emerges in Deep Generative Representations for Scene
Synthesis,
IJCV(129), No. 5, May 2021, pp. 1451-1466.
Springer DOI
2105
BibRef
Qi, Z.J.[Zhong-Jian],
Fan, C.G.[Chao-Gang],
Xu, L.F.[Liang-Feng],
Li, X.K.[Xin-Ke],
Zhan, S.[Shu],
MRP-GAN: Multi-resolution parallel generative adversarial networks
for text-to-image synthesis,
PRL(147), 2021, pp. 1-7.
Elsevier DOI
2106
Text-to-image synthesis, Generative adversarial networks, Image generation
BibRef
Li, Z.[Zeyu],
Deng, C.[Cheng],
Yang, E.K.[Er-Kun],
Tao, D.C.[Da-Cheng],
Staged Sketch-to-Image Synthesis via Semi-Supervised Generative
Adversarial Networks,
MultMed(23), 2021, pp. 2694-2705.
IEEE DOI
2109
Generative adversarial networks, Image generation,
Training, Image edge detection, Task analysis, sketch
BibRef
Zheng, J.B.[Jian-Bin],
Liu, D.Q.[Da-Qing],
Wang, C.Y.[Chao-Yue],
Hu, M.H.[Ming-Hui],
Yang, Z.P.[Zuo-Peng],
Ding, C.X.[Chang-Xing],
Tao, D.C.[Da-Cheng],
MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal
Conditional Image Synthesis,
IJCV(132), No. 1, January 2024, pp. 3537-3565.
Springer DOI
2409
BibRef
Rafique, M.U.[Muhammad Usman],
Zhang, Y.[Yu],
Brodie, B.[Benjamin],
Jacobs, N.[Nathan],
Unifying Guided and Unguided Outdoor Image Synthesis,
NTIRE21(776-785)
IEEE DOI
2109
Training, Image synthesis, Impedance matching, Layout,
Benchmark testing, Probabilistic logic
BibRef
Wang, M.[Min],
Lang, C.Y.[Cong-Yan],
Liang, L.Q.[Li-Qian],
Lyu, G.[Gengyu],
Feng, S.H.[Song-He],
Wang, T.[Tao],
Class-Balanced Text to Image Synthesis With Attentive Generative
Adversarial Network,
MultMedMag(28), No. 3, July 2021, pp. 21-31.
IEEE DOI
2109
Generative adversarial networks, Training data, Semantics,
Text processing, Image synthesis, generative adversarial network,
rebalance
BibRef
Li, A.[Ailin],
Zhao, L.[Lei],
Zuo, Z.W.[Zhi-Wen],
Wang, Z.Z.[Zhi-Zhong],
Chen, H.B.[Hai-Bo],
Lu, D.M.[Dong-Ming],
Xing, W.[Wei],
Diversified text-to-image generation via deep mutual information
estimation,
CVIU(211), 2021, pp. 103259.
Elsevier DOI
2110
Generative Adversarial Nets (GANs), Text-to-image generation, Mutual Information
BibRef
Wu, F.X.[Fu-Xiang],
Cheng, J.[Jun],
Wang, X.C.[Xin-Chao],
Wang, L.[Lei],
Tao, D.P.[Da-Peng],
Image Hallucination From Attribute Pairs,
Cyber(52), No. 1, January 2022, pp. 568-581.
IEEE DOI
2201
Semantics, Visualization, Generators, Syntactics,
Training, Natural language processing, text-to-image synthesis
BibRef
Hinz, T.[Tobias],
Heinrich, S.[Stefan],
Wermter, S.[Stefan],
Semantic Object Accuracy for Generative Text-to-Image Synthesis,
PAMI(44), No. 3, March 2022, pp. 1552-1565.
IEEE DOI
2202
Layout, Semantics, Measurement, Generators, Image resolution,
Image quality, Text-to-image synthesis,
generative models
BibRef
Tan, H.C.[Hong-Chen],
Liu, X.P.[Xiu-Ping],
Yin, B.C.[Bao-Cai],
Li, X.[Xin],
Cross-Modal Semantic Matching Generative Adversarial Networks for
Text-to-Image Synthesis,
MultMed(24), 2022, pp. 832-845.
IEEE DOI
2202
Semantics, Task analysis, Generative adversarial networks,
Generators, Feature extraction, Visualization,
text CNNs
BibRef
Feng, F.X.[Fang-Xiang],
Niu, T.R.[Tian-Rui],
Li, R.F.[Rui-Fan],
Wang, X.J.[Xiao-Jie],
Modality Disentangled Discriminator for Text-to-Image Synthesis,
MultMed(24), 2022, pp. 2112-2124.
IEEE DOI
2204
Task analysis, Correlation, Image synthesis, Image reconstruction,
Generative adversarial networks, Image representation,
multi-modal disentangled representation learning
BibRef
Tan, Y.X.[Yong Xuan],
Lee, C.P.[Chin Poo],
Neo, M.[Mai],
Lim, K.M.[Kian Ming],
Text-to-image synthesis with self-supervised learning,
PRL(157), 2022, pp. 119-126.
Elsevier DOI
2205
Text-to-image-synthesis, Generative adversarial network,
Self-supervised learning
BibRef
Tan, Y.X.[Yong Xuan],
Lee, C.P.[Chin Poo],
Neo, M.[Mai],
Lim, K.M.[Kian Ming],
Lim, J.Y.[Jit Yan],
Text-to-image synthesis with self-supervised bi-stage generative
adversarial network,
PRL(169), 2023, pp. 43-49.
Elsevier DOI
2305
Text-to-image-synthesis, Generative adversarial network,
Self-supervised learning, GAN
BibRef
Quan, F.[Fengnan],
Lang, B.[Bo],
Liu, Y.X.[Yan-Xi],
ARRPNGAN: Text-to-image GAN with attention regularization and region
proposal networks,
SP:IC(106), 2022, pp. 116728.
Elsevier DOI
2206
Text-to-image synthesis, Generative adversarial network,
Attention model, Region proposal network
BibRef
Wang, H.X.[Hong-Xia],
Ke, H.[Hao],
Liu, C.[Chun],
An embedded method: Improve the relevance of text and face image with
enhanced face attributes,
SP:IC(108), 2022, pp. 116815.
Elsevier DOI
2209
Generative adversarial networks,
Text-to-image face image generation, Face synthesis, Visual attributes
BibRef
Peng, J.[Jun],
Zhou, Y.[Yiyi],
Sun, X.S.[Xiao-Shuai],
Cao, L.J.[Liu-Juan],
Wu, Y.J.[Yong-Jian],
Huang, F.Y.[Fei-Yue],
Ji, R.R.[Rong-Rong],
Knowledge-Driven Generative Adversarial Network for Text-to-Image
Synthesis,
MultMed(24), 2022, pp. 4356-4366.
IEEE DOI
2210
Visualization, Generative adversarial networks, Task analysis,
Semantics, Measurement, Image synthesis, Feature extraction,
pseudo turing test
BibRef
Mazaheri, A.[Amir],
Shah, M.[Mubarak],
Video Generation from Text Employing Latent Path Construction for
Temporal Modeling,
ICPR22(5010-5016)
IEEE DOI
2212
Interpolation, Visualization, Natural languages,
Stacking, Machine learning
BibRef
Gu, J.J.[Jin-Jing],
Wang, H.L.[Han-Li],
Fan, R.C.[Rui-Chao],
Coherent Visual Storytelling via Parallel Top-Down Visual and Topic
Attention,
CirSysVideo(33), No. 1, January 2023, pp. 257-268.
IEEE DOI
2301
Visualization, Decoding, Neural networks, Coherence, Task analysis,
Image sequences, Feature extraction, Visual storytelling,
phrase beam search
BibRef
Li, T.P.[Teng-Peng],
Wang, H.L.[Han-Li],
He, B.[Bin],
Chen, C.W.[Chang Wen],
Knowledge-Enriched Attention Network With Group-Wise Semantic for
Visual Storytelling,
PAMI(45), No. 7, July 2023, pp. 8634-8645.
IEEE DOI
2306
Visualization, Semantics, Feature extraction, Decoding,
Streaming media, GSM, Technological innovation, Encoder-decoder,
visual storytelling
BibRef
Gao, L.[Lin],
Sun, J.M.[Jia-Mu],
Mo, K.[Kaichun],
Lai, Y.K.[Yu-Kun],
Guibas, L.J.[Leonidas J.],
Yang, J.[Jie],
SceneHGN: Hierarchical Graph Networks for 3D Indoor Scene Generation
With Fine-Grained Geometry,
PAMI(45), No. 7, July 2023, pp. 8902-8919.
IEEE DOI
2306
Geometry, Layout, Shape, Solid modeling, Neural networks,
Interpolation, 3D indoor scene synthesis, deep generative model,
variational autoencoder
BibRef
Hou, X.X.[Xian-Xu],
Zhang, X.K.[Xiao-Kang],
Li, Y.D.[Yu-Dong],
Shen, L.L.[Lin-Lin],
TextFace: Text-to-Style Mapping Based Face Generation and
Manipulation,
MultMed(25), 2023, pp. 3409-3419.
IEEE DOI
2309
BibRef
Liu, S.Y.[Si-Ying],
Dragotti, P.L.[Pier Luigi],
Sensing Diversity and Sparsity Models for Event Generation and Video
Reconstruction from Events,
PAMI(45), No. 10, October 2023, pp. 12444-12458.
IEEE DOI
2310
Event to video.
BibRef
Tan, Z.R.[Zhao-Rui],
Yang, X.[Xi],
Ye, Z.H.[Zi-Han],
Wang, Q.[Qiufeng],
Yan, Y.[Yuyao],
Nguyen, A.[Anh],
Huang, K.[Kaizhu],
Semantic Similarity Distance: Towards better text-image consistency
metric in text-to-image generation,
PR(144), 2023, pp. 109883.
Elsevier DOI
2310
Text-to-image, Image generation,
Generative adversarial networks, Semantic consistency
BibRef
Cheng, Q.R.[Qing-Rong],
Wen, K.Y.[Ke-Yu],
Gu, X.D.[Xiao-Dong],
Vision-Language Matching for Text-to-Image Synthesis via Generative
Adversarial Networks,
MultMed(25), 2023, pp. 7062-7075.
IEEE DOI
2311
BibRef
Gao, L.L.[Lian-Li],
Zhao, Q.[Qike],
Zhu, J.C.[Jun-Chen],
Su, S.[Sitong],
Cheng, L.[Lechao],
Zhao, L.[Lei],
From External to Internal: Structuring Image for Text-to-Image
Attributes Manipulation,
MultMed(25), 2023, pp. 7248-7261.
IEEE DOI Code:
WWW Link.
2311
BibRef
Sun, M.Z.[Ming-Zhen],
Wang, W.N.[Wei-Ning],
Zhu, X.X.[Xin-Xin],
Liu, J.[Jing],
Reparameterizing and dynamically quantizing image features for image
generation,
PR(146), 2024, pp. 109962.
Elsevier DOI
2311
Vector quantization, Variational auto-encoder,
Unconditional image generation, Text-to-image generation,
Autoregressive generation
BibRef
Liu, Z.Z.[Zheng-Zhe],
Dai, P.[Peng],
Li, R.[Ruihui],
Qi, X.J.[Xiao-Juan],
Fu, C.W.[Chi-Wing],
DreamStone: Image as a Stepping Stone for Text-Guided 3D Shape
Generation,
PAMI(45), No. 12, December 2023, pp. 14385-14403.
IEEE DOI
2311
BibRef
Tang, Z.M.[Zheng-Mi],
Miyazaki, T.[Tomo],
Omachi, S.[Shinichiro],
A Scene-Text Synthesis Engine Achieved Through Learning From
Decomposed Real-World Data,
IP(32), 2023, pp. 5837-5851.
IEEE DOI Code:
WWW Link.
2311
BibRef
Xu, Y.H.[Yong-Hao],
Yu, W.[Weikang],
Ghamisi, P.[Pedram],
Kopp, M.[Michael],
Hochreiter, S.[Sepp],
Txt2Img-MHN: Remote Sensing Image Generation From Text Using Modern
Hopfield Networks,
IP(32), 2023, pp. 5737-5750.
IEEE DOI Code:
WWW Link.
2311
BibRef
Tan, H.C.[Hong-Chen],
Yin, B.C.[Bao-Cai],
Wei, K.[Kun],
Liu, X.P.[Xiu-Ping],
Li, X.[Xin],
ALR-GAN: Adaptive Layout Refinement for Text-to-Image Synthesis,
MultMed(25), 2023, pp. 8620-8631.
IEEE DOI
2312
BibRef
Liang, J.D.[Jia-Dong],
Pei, W.J.[Wen-Jie],
Lu, F.[Feng],
Layout-Bridging Text-to-Image Synthesis,
CirSysVideo(33), No. 12, December 2023, pp. 7438-7451.
IEEE DOI
2312
BibRef
Kuang, Y.[Yi],
Ma, F.[Fei],
Li, F.F.[Fang-Fang],
Liu, Y.B.[Ying-Bing],
Zhang, F.[Fan],
Semantic-Layout-Guided Image Synthesis for High-Quality
Synthetic-Aperture Radar Detection Sample Generation,
RS(15), No. 24, 2023, pp. 5654.
DOI Link
2401
BibRef
Liu, A.A.[An-An],
Sun, Z.F.[Ze-Fang],
Xu, N.[Ning],
Kang, R.B.[Rong-Bao],
Cao, J.[Jinbo],
Yang, F.[Fan],
Qin, W.J.[Wei-Jun],
Zhang, S.Y.[Shen-Yuan],
Zhang, J.Q.[Jia-Qi],
Li, X.[Xuanya],
Prior knowledge guided text to image generation,
PRL(177), 2024, pp. 89-95.
Elsevier DOI
2401
Text-to-image synthesis, Generative Adversarial Networks, Knowledge Guided GAN
BibRef
Köksal, A.[Ali],
Ak, K.E.[Kenan E.],
Sun, Y.[Ying],
Rajan, D.[Deepu],
Lim, J.H.[Joo Hwee],
Controllable Video Generation With Text-Based Instructions,
MultMed(26), 2024, pp. 190-201.
IEEE DOI
2401
BibRef
Liu, J.W.[Jia-Wei],
Wang, W.N.[Wei-Ning],
Chen, S.[Sihan],
Zhu, X.X.[Xin-Xin],
Liu, J.[Jing],
Sounding Video Generator: A Unified Framework for Text-Guided
Sounding Video Generation,
MultMed(26), 2024, pp. 141-153.
IEEE DOI
2401
BibRef
Ye, S.M.[Sen-Mao],
Wang, H.[Huan],
Tan, M.K.[Ming-Kui],
Liu, F.[Fei],
Recurrent Affine Transformation for Text-to-Image Synthesis,
MultMed(26), 2024, pp. 462-473.
IEEE DOI
2402
Generators, Visualization, Fuses, Computational modeling,
Generative adversarial networks, Training, Task analysis,
spatial attention
BibRef
Yuan, B.[Bowen],
Sheng, Y.F.[Ye-Fei],
Bao, B.K.[Bing-Kun],
Chen, Y.P.P.[Yi-Ping Phoebe],
Xu, C.S.[Chang-Sheng],
Semantic Distance Adversarial Learning for Text-to-Image Synthesis,
MultMed(26), 2024, pp. 1255-1266.
IEEE DOI
2402
Semantics, Generators, Training, Adversarial machine learning,
Feature extraction, Generative adversarial networks, Birds, cycle consistency
BibRef
Zhou, H.P.[Hua-Ping],
Wu, T.[Tao],
Ye, S.M.[Sen-Mao],
Qin, X.[Xinru],
Sun, K.[Kelei],
Enhancing fine-detail image synthesis from text descriptions by text
aggregation and connection fusion module,
SP:IC(122), 2024, pp. 117099.
Elsevier DOI
2402
Generative adversarial network, Semantic consistency,
Spatial attention, Text-to-image generation, Single-stage network
BibRef
Hu, Y.[Yaosi],
Luo, C.[Chong],
Chen, Z.Z.[Zhen-Zhong],
A Benchmark for Controllable Text-Image-to-Video Generation,
MultMed(26), 2024, pp. 1706-1719.
IEEE DOI
2402
Task analysis, Measurement, Generators, Uncertainty, Visualization, Dynamics,
Benchmark testing, Video generation, text-image-to-video,
multimodal-conditioned generation
BibRef
Han, G.[Guang],
Lin, M.[Min],
Li, Z.Y.[Zi-Yang],
Zhao, H.T.[Hai-Tao],
Kwong, S.[Sam],
Text-to-Image Person Re-Identification Based on Multimodal Graph
Convolutional Network,
MultMed(26), 2024, pp. 6025-6036.
IEEE DOI
2404
Feature extraction, Task analysis, Visualization, Semantics,
Graph neural networks, Data mining, graph convolutional network
BibRef
Zhou, Y.[Yan],
Qian, J.[Jiechang],
Zhang, H.[Huaidong],
Xu, X.[Xuemiao],
Sun, H.[Huajie],
Zeng, F.[Fanzhi],
Zhou, Y.X.[Yue-Xia],
Adaptive multi-text union for stable text-to-image synthesis learning,
PR(152), 2024, pp. 110438.
Elsevier DOI
2405
Adaptive multi-text union learning, Text-to-image synthesis,
Cross-modal generation
BibRef
Yang, B.[Bing],
Xiang, X.Q.[Xue-Qin],
Kong, W.Z.[Wang-Zeng],
Zhang, J.H.[Jian-Hai],
Peng, Y.[Yong],
DMF-GAN: Deep Multimodal Fusion Generative Adversarial Networks for
Text-to-Image Synthesis,
MultMed(26), 2024, pp. 6956-6967.
IEEE DOI
2405
Semantics, Generative adversarial networks, Generators, Training,
Visualization, Image synthesis, Fuses, Deep multimodal fusion,
text-to-image (T2I) synthesis
BibRef
Tang, H.[Hao],
Shao, L.[Ling],
Sebe, N.[Nicu],
Van Gool, L.J.[Luc J.],
Graph Transformer GANs With Graph Masked Modeling for Architectural
Layout Generation,
PAMI(46), No. 6, June 2024, pp. 4298-4313.
IEEE DOI
2405
Layout, Transformers, Task analysis,
Generative adversarial networks, Generators, Semantics, Buildings,
architectural layout generation
BibRef
Dong, P.[Pei],
Wu, L.[Lei],
Li, R.C.[Rui-Chen],
Meng, X.X.[Xiang-Xu],
Meng, L.[Lei],
Text to image synthesis with multi-granularity feature aware
enhancement Generative Adversarial Networks,
CVIU(245), 2024, pp. 104042.
Elsevier DOI
2406
Generative adversarial network,
Multi-granularity feature aware enhancement, Text-to-image, Diffusion
BibRef
Tan, H.C.[Hong-Chen],
Yin, B.C.[Bao-Cai],
Xu, K.Q.[Kai-Qiang],
Wang, H.S.[Hua-Sheng],
Liu, X.P.[Xiu-Ping],
Li, X.[Xin],
Attention-Bridged Modal Interaction for Text-to-Image Generation,
CirSysVideo(34), No. 7, July 2024, pp. 5400-5413.
IEEE DOI
2407
Semantics, Task analysis, Visualization, Computational modeling,
Image synthesis, Generators, Layout,
residual perception discriminator
BibRef
Baraheem, S.S.[Samah S.],
Nguyen, T.V.[Tam V.],
S5: Sketch-to-Image Synthesis via Scene and Size Sensing,
MultMedMag(31), No. 2, April 2024, pp. 7-16.
IEEE DOI
2408
Image synthesis, Instance segmentation, Feature extraction,
Semantics, Image edge detection, Task analysis, Image analysis
BibRef
Zhao, L.[Liang],
Huang, P.[Pingda],
Chen, T.T.[Teng-Tuo],
Fu, C.J.[Chun-Jiang],
Hu, Q.H.[Qing-Hao],
Zhang, Y.Q.[Yang-Qianhui],
Multi-Sentence Complementarily Generation for Text-to-Image Synthesis,
MultMed(26), 2024, pp. 8323-8332.
IEEE DOI
2408
Semantics, Birds, Generative adversarial networks,
Feature extraction, Task analysis, Image synthesis, Generators,
text-to-image
BibRef
Zhao, L.[Liang],
Hu, Q.[Qinghao],
Li, X.Y.[Xiao-Yuan],
Zhao, J.Y.[Jing-Yuan],
Multimodal Fusion Generative Adversarial Network for Image Synthesis,
SPLetters(31), 2024, pp. 1865-1869.
IEEE DOI
2408
Image synthesis, Semantics, Image quality,
Generative adversarial networks, Attention mechanisms,
text-to-image synthesis
BibRef
Wu, Z.Y.[Zhen-Yu],
Wang, Z.W.[Zi-Wei],
Liu, S.Y.[Sheng-Yu],
Luo, H.[Hao],
Lu, J.W.[Ji-Wen],
Yan, H.B.[Hai-Bin],
FairScene: Learning unbiased object interactions for indoor scene
synthesis,
PR(156), 2024, pp. 110737.
Elsevier DOI
2408
Indoor scene synthesis, Graph neural networks, Causal inference
BibRef
Nazarieh, F.[Fatemeh],
Feng, Z.H.[Zhen-Hua],
Awais, M.[Muhammad],
Wang, W.W.[Wen-Wu],
Kittler, J.V.[Josef V.],
A Survey of Cross-Modal Visual Content Generation,
CirSysVideo(34), No. 8, August 2024, pp. 6814-6832.
IEEE DOI
2408
Visualization, Surveys, Data models, Task analysis, Measurement,
Training, Generative adversarial networks, Generative models,
visual content generation
BibRef
Wang, Y.X.[Yi-Xuan],
Zhou, W.G.[Wen-Gang],
Bao, J.M.[Jian-Min],
Wang, W.[Weilun],
Li, L.[Li],
Li, H.Q.[Hou-Qiang],
CLIP2GAN: Toward Bridging Text With the Latent Space of GANs,
CirSysVideo(34), No. 8, August 2024, pp. 6847-6859.
IEEE DOI
2408
Image synthesis, Training, Hair, Task analysis, Faces, Codes,
Visualization, Text-guided image generation, image editing,
generative adversarial nets
BibRef
Zhai, Y.K.[Yi-Kui],
Long, Z.H.[Zhi-Hao],
Pan, W.F.[Wen-Feng],
Chen, C.L.P.[C. L. Philip],
Mutual Information Compensation for High-Fidelity Image Generation
With Limited Data,
SPLetters(31), 2024, pp. 2145-2149.
IEEE DOI
2409
Mutual information, Training, Generators, Image synthesis,
Image resolution, Generative adversarial networks, wavelet transform
BibRef
Li, Z.Y.[Zhuo-Yuan],
Sun, Y.[Yi],
Parameter efficient finetuning of text-to-image models with trainable
self-attention layer,
IVC(151), 2024, pp. 105296.
Elsevier DOI
2411
T2I models, Efficient finetuning, Attention control
BibRef
Croitoru, F.A.[Florinel-Alin],
Hondru, V.[Vlad],
Ionescu, R.T.[Radu Tudor],
Shah, M.[Mubarak],
Reverse Stable Diffusion: What prompt was used to generate this
image?,
CVIU(249), 2024, pp. 104210.
Elsevier DOI Code:
WWW Link.
2412
Diffusion models, Reverse engineering,
Image-to-prompt prediction, Text-to-image generation
BibRef
Ibarrola, F.[Francisco],
Lulham, R.[Rohan],
Grace, K.[Kazjon],
Affect-Conditioned Image Generation,
AffCom(15), No. 4, October 2024, pp. 2169-2179.
IEEE DOI
2412
Training, Semantics, Predictive models, Creativity,
Computational modeling, Task analysis, Neural networks,
semantic models
BibRef
Ahmed, Y.A.[Yeruru Asrar],
Mittal, A.[Anurag],
Unsupervised Co-Generation of Foreground-Background Segmentation from
Text-to-Image Synthesis,
CVIU(250), 2025, pp. 104223.
Elsevier DOI
2501
BibRef
Earlier:
WACV24(5046-5057)
IEEE DOI
2404
Text-to-Image generations, GANs, Generative Adversarial Networks.
Training, Image segmentation, Visualization,
Computational modeling, Training data, Computer architecture,
Vision + language and/or other modalities
BibRef
Li, A.[Ailin],
Zhao, L.[Lei],
Zuo, Z.W.[Zhi-Wen],
Xing, W.[Wei],
Lu, D.M.[Dong-Ming],
Specific Diverse Text-to-Image Synthesis via Exemplar Guidance,
MultMedMag(31), No. 4, October 2024, pp. 37-48.
IEEE DOI
2501
Visualization, Task analysis, Semantics, Image synthesis, Generators,
Training, Vectors
BibRef
Zhang, Y.[Yue],
Peng, C.T.[Cheng-Tao],
Wang, Q.[Qiuli],
Song, D.[Dan],
Li, K.[Kaiyan],
Zhou, S.K.[S. Kevin],
Unified Multi-Modal Image Synthesis for Missing Modality Imputation,
MedImg(44), No. 1, January 2025, pp. 4-18.
IEEE DOI
2501
Image synthesis, Imputation, Medical diagnostic imaging,
Task analysis, Streams, Training, Feature extraction, data imputation
BibRef
Stracke, N.[Nick],
Baumann, S.A.[Stefan Andreas],
Susskind, J.[Joshua],
Bautista, M.A.[Miguel Angel],
Ommer, B.[Björn],
CTRLorALTer: Conditional LorALTer for Efficient 0-shot Control and
Altering of T2I Models,
ECCV24(LXXXVIII: 87-103).
Springer DOI
2412
BibRef
Hemmat, R.A.[Reyhane Askari],
Hall, M.[Melissa],
Sun, A.[Alicia],
Ross, C.[Candace],
Drozdzal, M.[Michal],
Romero-Soriano, A.[Adriana],
Improving Geo-diversity of Generated Images with Contextualized Vendi
Score Guidance,
ECCV24(LXXXVII: 213-229).
Springer DOI
2412
Code:
WWW Link.
BibRef
Li, P.Z.[Peng-Zhi],
Nie, Q.[Qiang],
Chen, Y.[Ying],
Jiang, X.[Xi],
Wu, K.[Kai],
Lin, Y.[Yuhuan],
Liu, Y.[Yong],
Peng, J.L.[Jin-Long],
Wang, C.J.[Cheng-Jie],
Zheng, F.[Feng],
Tuning-free Image Customization with Image and Text Guidance,
ECCV24(LXXVI: 233-250).
Springer DOI
2412
Project:
WWW Link.
Guided customization.
BibRef
Liu, J.Q.[Jia-Qi],
Huang, T.[Tao],
Xu, C.[Chang],
Training-free Composite Scene Generation for Layout-to-image Synthesis,
ECCV24(LXVIII: 37-53).
Springer DOI
2412
BibRef
Hong, Y.[Yan],
Duan, Y.X.[Yu-Xuan],
Zhang, B.[Bo],
Chen, H.X.[Hao-Xing],
Lan, J.[Jun],
Zhu, H.[Huijia],
Wang, W.Q.[Wei-Qiang],
Zhang, J.[Jianfu],
Comfusion: Enhancing Personalized Generation by Instance-scene
Compositing and Fusion,
ECCV24(XLIV: 1-18).
Springer DOI
2412
Personalized.
BibRef
Xue, X.T.[Xiang-Tian],
Wu, J.S.[Jia-Song],
Kong, Y.Y.[You-Yong],
Senhadji, L.[Lotfi],
Shu, H.Z.[Hua-Zhong],
ST-LDM: A Universal Framework for Text-grounded Object Generation in
Real Images,
ECCV24(XLVI: 145-162).
Springer DOI
2412
BibRef
Wu, Z.F.[Zhi-Fan],
Huang, L.H.[Liang-Hua],
Wang, W.[Wei],
Wei, Y.H.[Yan-Heng],
Liu, Y.[Yu],
MultiGen: Zero-Shot Image Generation from Multi-Modal Prompts,
ECCV24(VIII: 297-313).
Springer DOI
2412
BibRef
Seol, J.[Jaejung],
Kim, S.[Seojun],
Yoo, J.[Jaejun],
Posterllama: Bridging Design Ability of Language Model to Content-aware
Layout Generation,
ECCV24(LXXXII: 451-468).
Springer DOI
2412
BibRef
Guerreiro, J.J.A.[Julian Jorge Andrade],
Inoue, N.[Naoto],
Masui, K.[Kento],
Otani, M.[Mayu],
Nakayama, H.[Hideki],
Layoutflow: Flow Matching for Layout Generation,
ECCV24(XXXVI: 56-72).
Springer DOI
2412
BibRef
Sun, Q.[Qi],
Zhou, H.[Hang],
Zhou, W.G.[Wen-Gang],
Li, L.[Li],
Li, H.Q.[Hou-Qiang],
Forest2seq: Revitalizing Order Prior for Sequential Indoor Scene
Synthesis,
ECCV24(XXV: 251-268).
Springer DOI
2412
BibRef
Wei, Y.X.[Yu-Xiang],
Ji, Z.L.[Zhi-Ling],
Bai, J.F.[Jin-Feng],
Zhang, H.Z.[Hong-Zhi],
Zhang, L.[Lei],
Zuo, W.M.[Wang-Meng],
Masterweaver: Taming Editability and Face Identity for Personalized
Text-to-image Generation,
ECCV24(LI: 252-271).
Springer DOI
2412
BibRef
Kwon, M.[Mingi],
Oh, S.W.[Seoung Wug],
Zhou, Y.[Yang],
Liu, D.[Difan],
Lee, J.Y.[Joon-Young],
Cai, H.R.[Hao-Ran],
Liu, B.[Baqiao],
Liu, F.[Feng],
Uh, Y.J.[Young-Jung],
Harivo: Harnessing Text-to-image Models for Video Generation,
ECCV24(LIII: 19-36).
Springer DOI
2412
BibRef
Lee, S.H.[Seung Hyun],
Li, Y.[Yinxiao],
Ke, J.J.[Jun-Jie],
Yoo, I.[Innfarn],
Zhang, H.[Han],
Yu, J.[Jiahui],
Wang, Q.F.[Qi-Fei],
Deng, F.[Fei],
Entis, G.[Glenn],
He, J.F.[Jun-Feng],
Li, G.[Gang],
Kim, S.[Sangpil],
Essa, I.[Irfan],
Yang, F.[Feng],
Parrot: Pareto-optimal Multi-reward Reinforcement Learning Framework
for Text-to-image Generation,
ECCV24(XXXVIII: 462-478).
Springer DOI
2412
BibRef
Zheng, A.Y.J.[Amber Yi-Jia],
Yeh, R.A.[Raymond A.],
IMMA: Immunizing Text-to-image Models Against Malicious Adaptation,
ECCV24(XXXIX: 458-475).
Springer DOI
2412
BibRef
Chen, J.S.[Jun-Song],
Ge, C.J.[Chong-Jian],
Xie, E.[Enze],
Wu, Y.[Yue],
Yao, L.W.[Le-Wei],
Ren, X.Z.[Xiao-Zhe],
Wang, Z.[Zhongdao],
Luo, P.[Ping],
Lu, H.C.[Hu-Chuan],
Li, Z.G.[Zhen-Guo],
Pixart-sigma: Weak-to-strong Training of Diffusion Transformer for 4k
Text-to-image Generation,
ECCV24(XXXII: 74-91).
Springer DOI
2412
BibRef
Chatterjee, A.[Agneet],
Ben Melech-Stan, G.[Gabriela],
Aflalo, E.[Estelle],
Paul, S.[Sayak],
Ghosh, D.[Dhruba],
Gokhale, T.[Tejas],
Schmidt, L.[Ludwig],
Hajishirzi, H.[Hannaneh],
Lal, V.[Vasudev],
Baral, C.[Chitta],
Yang, Y.Z.[Ye-Zhou],
Getting it Right: Improving Spatial Consistency in Text-to-image Models,
ECCV24(XXII: 204-222).
Springer DOI
2412
BibRef
Liu, R.T.[Run-Tao],
Khakzar, A.[Ashkan],
Gu, J.D.[Jin-Dong],
Chen, Q.F.[Qi-Feng],
Torr, P.H.S.[Philip H.S.],
Pizzati, F.[Fabio],
Latent Guard: A Safety Framework for Text-to-image Generation,
ECCV24(XXVI: 93-109).
Springer DOI
2412
BibRef
Wei, F.[Fanyue],
Zeng, W.[Wei],
Li, Z.Y.[Zhen-Yang],
Yin, D.W.[Da-Wei],
Duan, L.X.[Li-Xin],
Li, W.[Wen],
Powerful and Flexible: Personalized Text-to-image Generation via
Reinforcement Learning,
ECCV24(XXVII: 394-410).
Springer DOI
2412
BibRef
Li, H.T.[Han-Ting],
Niu, H.J.[Hong-Jing],
Zhao, F.[Feng],
Stable Preference: Redefining Training Paradigm of Human Preference
Model for Text-to-image Synthesis,
ECCV24(XXVIII: 250-266).
Springer DOI
2412
BibRef
Gal, R.[Rinon],
Lichter, O.[Or],
Richardson, E.[Elad],
Patashnik, O.[Or],
Bermano, A.H.[Amit H.],
Chechik, G.[Gal],
Cohen-Or, D.[Daniel],
LCM-Lookahead for Encoder-based Text-to-image Personalization,
ECCV24(XIV: 322-340).
Springer DOI
2412
BibRef
Dahary, O.[Omer],
Patashnik, O.[Or],
Aberman, K.[Kfir],
Cohen-Or, D.[Daniel],
Be Yourself: Bounded Attention for Multi-subject Text-to-image
Generation,
ECCV24(XIV: 432-448).
Springer DOI
2412
BibRef
Xiong, P.X.[Pei-Xi],
Kozuch, M.[Michael],
Jain, N.[Nilesh],
Textual-visual Logic Challenge:
Understanding and Reasoning in Text-to-image Generation,
ECCV24(V: 318-334).
Springer DOI
2412
BibRef
Sun, Y.[Yanan],
Liu, Y.C.[Yan-Chen],
Tang, Y.[Yinhao],
Pei, W.J.[Wen-Jie],
Chen, K.[Kai],
Anycontrol: Create Your Artwork with Versatile Control on Text-to-image
Generation,
ECCV24(XI: 92-109).
Springer DOI
2412
BibRef
Abdullah, A.[Ahmed],
Ebert, N.[Nikolas],
Wasenmüller, O.[Oliver],
Boosting Few-shot Detection with Large Language Models and
Layout-to-image Synthesis,
ACCV24(VII: 202-219).
Springer DOI
2412
BibRef
Lu, C.Y.[Chen-Yi],
Agarwal, S.[Shubham],
Tanjim, M.M.[Md Mehrab],
Mahadik, K.[Kanak],
Rao, A.[Anup],
Mitra, S.[Subrata],
Saini, S.K.[Shiv Kumar],
Bagchi, S.[Saurabh],
Chaterji, S.[Somali],
Recon: Training-free Acceleration for Text-to-image Synthesis with
Retrieval of Concept Prompt Trajectories,
ECCV24(LIX: 288-306).
Springer DOI
2412
BibRef
Zhao, S.H.[Shi-Hao],
Hao, S.[Shaozhe],
Zi, B.[Bojia],
Xu, H.Z.[Huai-Zhe],
Wong, K.Y.K.[Kwan-Yee K.],
Bridging Different Language Models and Generative Vision Models for
Text-to-image Generation,
ECCV24(LXXXI: 70-86).
Springer DOI
2412
BibRef
Tan, Z.Y.[Zhi-Yu],
Yang, M.[Mengping],
Qin, L.[Luozheng],
Yang, H.[Hao],
Qian, Y.[Ye],
Zhou, Q.[Qiang],
Zhang, C.[Cheng],
Li, H.[Hao],
An Empirical Study and Analysis of Text-to-image Generation Using Large
Language Model-powered Textual Representation,
ECCV24(LXXX: 472-489).
Springer DOI
2412
BibRef
Chinchure, A.[Aditya],
Shukla, P.[Pushkar],
Bhatt, G.[Gaurav],
Salij, K.[Kiri],
Hosanagar, K.[Kartik],
Sigal, L.[Leonid],
Turk, M.[Matthew],
Tibet: Identifying and Evaluating Biases in Text-to-image Generative
Models,
ECCV24(LXXIX: 429-446).
Springer DOI
2412
BibRef
Mittal, S.[Surbhi],
Sudan, A.[Arnav],
Vatsa, M.[Mayank],
Singh, R.[Richa],
Glaser, T.[Tamar],
Hassner, T.[Tal],
Navigating Text-to-image Generative Bias Across Indic Languages,
ECCV24(LXXXVIII: 53-67).
Springer DOI
2412
BibRef
Chang, Y.S.[Ying-Shan],
Zhang, Y.[Yasi],
Fang, Z.Y.[Zhi-Yuan],
Wu, Y.N.[Ying Nian],
Bisk, Y.[Yonatan],
Gao, F.[Feng],
Skews in the Phenomenon Space Hinder Generalization in Text-to-image
Generation,
ECCV24(LXXXVII: 422-439).
Springer DOI
2412
BibRef
Yang, Y.Q.[Yu-Qing],
Moremada, C.[Charuka],
Deligiannis, N.[Nikos],
On the Detection of Images Generated from Text,
ICIP24(3792-3798)
IEEE DOI
2411
Resistance, Visualization, Computational modeling,
Perturbation methods, Noise, Text to image, Detectors, robustness
BibRef
Liu, Z.X.[Zhi-Xuan],
Schaldenbrand, P.[Peter],
Okogwu, B.C.[Beverley-Claire],
Peng, W.X.[Wen-Xuan],
Yun, Y.[Youngsik],
Hundt, A.[Andrew],
Kim, J.[Jihie],
Oh, J.[Jean],
SCoFT: Self-Contrastive Fine-Tuning for Equitable Image Generation,
CVPR24(10822-10832)
IEEE DOI Code:
WWW Link.
2410
I.e. cultural biases.
Measurement, Surveys, Image synthesis, Generative AI, Media,
Copyright protection, Data models, Image Synthesis,
Computer Vision for Social Good
BibRef
Zhang, H.[Hang],
Savov, A.[Anton],
Dillenburger, B.[Benjamin],
MaskPLAN: Masked Generative Layout Planning from Partial Input,
CVPR24(8964-8973)
IEEE DOI
2410
Measurement, Training, Layout, Transformers, Planning,
MAE, Generative, Graph, Layout
BibRef
Zhang, Y.X.[Yu-Xuan],
Song, Y.[Yiren],
Liu, J.[Jia-Ming],
Wang, R.[Rui],
Yu, J.P.[Jin-Peng],
Tang, H.[Hao],
Li, H.X.[Hua-Xia],
Tang, X.[Xu],
Hu, Y.[Yao],
Pan, H.[Han],
Jing, Z.L.[Zhong-Liang],
SSR-Encoder: Encoding Selective Subject Representation for
Subject-Driven Generation,
CVPR24(8069-8078)
IEEE DOI
2410
Code:
WWW Link.
Training, Adaptation models, Image coding, Image synthesis,
Ecosystems, Feature extraction
BibRef
Lee, J.[Jumin],
Lee, S.[Sebin],
Jo, C.[Changho],
Im, W.B.[Woo-Bin],
Seon, J.[Juhyeong],
Yoon, S.E.[Sung-Eui],
SemCity: Semantic Scene Generation with Triplane Diffusion,
CVPR24(28337-28347)
IEEE DOI Code:
WWW Link.
2410
Roads, Computational modeling, Semantics, Diffusion processes,
Diffusion models, diffusion models, scene generation,
semantic generation
BibRef
Raistrick, A.[Alexander],
Mei, L.J.[Ling-Jie],
Kayan, K.[Karhan],
Yan, D.[David],
Zuo, Y.M.[Yi-Ming],
Han, B.[Beining],
Wen, H.Y.[Hong-Yu],
Parakh, M.[Meenal],
Alexandropoulos, S.[Stamatis],
Lipson, L.[Lahav],
Ma, Z.[Zeyu],
Deng, J.[Jia],
Infinigen Indoors: Photorealistic Indoor Scenes using Procedural
Generation,
CVPR24(21783-21794)
IEEE DOI
2410
Training, Procedural generation, Licenses, Real-time systems,
Libraries, Generators, Procedural Generation, Indoor, Dataset,
Robotics
BibRef
Lin, Z.Q.[Zhi-Qiu],
Pathak, D.[Deepak],
Li, B.[Baiqi],
Li, J.Y.[Jia-Yao],
Xia, X.[Xide],
Neubig, G.[Graham],
Zhang, P.[Pengchuan],
Ramanan, D.[Deva],
Evaluating Text-to-visual Generation with Image-to-text Generation,
ECCV24(IX: 366-384).
Springer DOI
2412
BibRef
Li, B.[Baiqi],
Lin, Z.Q.[Zhi-Qiu],
Pathak, D.[Deepak],
Li, J.Y.[Jia-Yao],
Fei, Y.X.[Yi-Xin],
Wu, K.[Kewen],
Xia, X.[Xide],
Zhang, P.C.[Peng-Chuan],
Neubig, G.[Graham],
Ramanan, D.[Deva],
Evaluating and Improving Compositional Text-to-Visual Generation,
GenerativeFM24(5290-5301)
IEEE DOI
2410
Measurement, Visualization, Toxicology, Closed box, Footwear,
Cognition
BibRef
Ji, P.L.[Peng-Liang],
Liu, J.[Junchen],
TlTScore: Towards Long-Tail Effects in Text-to-Visual Evaluation with
Generative Foundation Models,
GenerativeFM24(5302-5313)
IEEE DOI
2410
Measurement, Accuracy, Semantics, Benchmark testing, Cognition,
text-to-visual evaluation, benchmark evaluation, multimodal,
generative foundation models
BibRef
Zhao, S.Y.[Shi-Yu],
Zhao, L.[Long],
Kumar, B.G.V.[B.G. Vijay],
Suh, Y.M.[Yu-Min],
Metaxas, D.N.[Dimitris N.],
Chandraker, M.[Manmohan],
Schulter, S.[Samuel],
Generating Enhanced Negatives for Training Language-Based Object
Detectors,
CVPR24(13592-13602)
IEEE DOI Code:
WWW Link.
2410
Training, Vocabulary, Accuracy, Training data, Text to image,
Detectors, Benchmark testing, open-vocabulary object detection,
negative example mining
BibRef
Margaryan, H.[Hovhannes],
Hayrapetyan, D.[Daniil],
Cong, W.[Wenyan],
Wang, Z.Y.[Zhang-Yang],
Shi, H.[Humphrey],
DGBD: Depth Guided Branched Diffusion for Comprehensive
Controllability in Multi-View Generation,
L3D24(747-756)
IEEE DOI
2410
Geometry, Shape, Pipelines, Text to image, Cameras,
controllable multi-view generation, diffusion models,
controllablity in multi-view generation
BibRef
Fan, L.J.[Li-Jie],
Chen, K.[Kaifeng],
Krishnan, D.[Dilip],
Katabi, D.[Dina],
Isola, P.[Phillip],
Tian, Y.[Yonglong],
Scaling Laws of Synthetic Images for Model Training ... for Now,
CVPR24(7382-7392)
IEEE DOI
2410
Training, Computational modeling, Machine vision, Text to image,
Training data, Data models
BibRef
Cai, H.[Han],
Li, M.[Muyang],
Zhang, Q.[Qinsheng],
Liu, M.Y.[Ming-Yu],
Han, S.[Song],
Condition-Aware Neural Network for Controlled Image Generation,
CVPR24(7194-7203)
IEEE DOI
2410
Image synthesis, Computational modeling, Neural networks,
Text to image, Process control, Transformers,
efficient deep learning
BibRef
Qiao, P.[Pengchong],
Shang, L.[Lei],
Liu, C.[Chang],
Sun, B.[Baigui],
Ji, X.Y.[Xiang-Yang],
Chen, J.[Jie],
FaceChain-SuDe: Building Derived Class to Inherit Category Attributes
for One-Shot Subject-Driven Generation,
CVPR24(7215-7224)
IEEE DOI
2410
Codes, Object oriented modeling, Semantics, Buildings, Text to image,
subject-driven generation, derived class
BibRef
Wu, R.Q.[Rui-Qi],
Chen, L.[Liangyu],
Yang, T.[Tong],
Guo, C.[Chunle],
Li, C.Y.[Chong-Yi],
Zhang, X.Y.[Xiang-Yu],
LAMP: Learn A Motion Pattern for Few-Shot Video Generation,
CVPR24(7089-7098)
IEEE DOI Code:
WWW Link.
2410
Training, Computational modeling, Pipelines, Text to image,
Diffusion models, Stability analysis, Quality assessment
BibRef
Zhu, J.Y.[Jia-Yi],
Guo, Q.[Qing],
Juefei-Xu, F.[Felix],
Huang, Y.H.[Yi-Hao],
Liu, Y.[Yang],
Pu, G.[Geguang],
Cosalpure: Learning Concept from Group Images for Robust Co-Saliency
Detection,
CVPR24(3669-3678)
IEEE DOI Code:
WWW Link.
2410
Technological innovation, Purification, Perturbation methods,
Noise, Semantics, Text to image, Object detection
BibRef
Chan, K.C.K.[Kelvin C.K.],
Zhao, Y.[Yang],
Jia, X.[Xuhui],
Yang, M.H.[Ming-Hsuan],
Wang, H.[Huisheng],
Improving Subject-Driven Image Synthesis with Subject-Agnostic
Guidance,
CVPR24(6733-6742)
IEEE DOI
2410
Training, Codes, Image synthesis, Text to image
BibRef
Haji-Ali, M.[Moayed],
Balakrishnan, G.[Guha],
Ordonez, V.[Vicente],
ElasticDiffusion: Training-Free Arbitrary Size Image Generation
Through Global-Local Content Separation,
CVPR24(6603-6612)
IEEE DOI Code:
WWW Link.
2410
Image synthesis, Text to image, Coherence, Diffusion models,
Decoding, Trajectory, Text2Image, Diffusion Models, Image Generation,
Stablediffusion
BibRef
Haydarov, K.[Kilichbek],
Muhamed, A.[Aashiq],
Shen, X.Q.[Xiao-Qian],
Lazarevic, J.[Jovana],
Skorokhodov, I.[Ivan],
Galappaththige, C.J.[Chamuditha Jayanga],
Elhoseiny, M.[Mohamed],
Adversarial Text to Continuous Image Generation,
CVPR24(6316-6326)
IEEE DOI Code:
WWW Link.
2410
Training, Tensors, Image synthesis, Text to image, Modulation, Process control
BibRef
Zhang, C.[Cheng],
Wu, Q.Y.[Qian-Yi],
Gambardella, C.C.[Camilo Cruz],
Huang, X.S.[Xiao-Shui],
Phung, D.[Dinh],
Ouyang, W.L.[Wan-Li],
Cai, J.F.[Jian-Fei],
Taming Stable Diffusion for Text to 360° Panorama Image Generation,
CVPR24(6347-6357)
IEEE DOI
2410
Image synthesis, Layout, Noise reduction, Computer architecture,
Diffusion models, Distortion
BibRef
Tang, J.[Junshu],
Zeng, Y.H.[Yan-Hong],
Fan, K.[Ke],
Wang, X.H.[Xu-Heng],
Dai, B.[Bo],
Chen, K.[Kai],
Ma, L.Z.[Li-Zhuang],
Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from
Text,
CVPR24(6243-6253)
IEEE DOI
2410
Training, Geometry, Semantics, Text to image, Production,
Texture generation, diffusion model
BibRef
Hu, H.X.[He-Xiang],
Chan, K.C.K.[Kelvin C.K.],
Su, Y.C.[Yu-Chuan],
Chen, W.[Wenhu],
Li, Y.D.[Yan-Dong],
Sohn, K.[Kihyuk],
Zhao, Y.[Yang],
Ben, X.[Xue],
Gong, B.Q.[Bo-Qing],
Cohen, W.[William],
Chang, M.W.[Ming-Wei],
Jia, X.[Xuhui],
Instruct-Imagen: Image Generation with Multi-modal Instruction,
CVPR24(4754-4763)
IEEE DOI
2410
Training, Adaptation models, Image synthesis, Image edge detection,
Natural languages, Text to image, Diffusion Model,
Generalization to Unseen Tasks
BibRef
Kondapaneni, N.[Neehar],
Marks, M.[Markus],
Knott, M.[Manuel],
Guimaraes, R.[Rogerio],
Perona, P.[Pietro],
Text-Image Alignment for Diffusion-Based Perception,
CVPR24(13883-13893)
IEEE DOI
2410
Visualization, Codes, Semantic segmentation,
Computational modeling, Text to image, Estimation, ADE20K
BibRef
Qiao, R.[Runqi],
Yang, L.[Lan],
Pang, K.Y.[Kai-Yue],
Zhang, H.G.[Hong-Gang],
Making Visual Sense of Oracle Bones for You and Me,
CVPR24(12656-12665)
IEEE DOI Code:
WWW Link.
2410
Training, Heart, Visualization, Semantics, Text to image, Manuals, Bones
BibRef
Shrestha, R.[Robik],
Zou, Y.[Yang],
Chen, Q.Y.[Qiu-Yu],
Li, Z.H.[Zhi-Heng],
Xie, Y.S.[Yu-Sheng],
Deng, S.Q.[Si-Qi],
FairRAG: Fair Human Generation via Fair Retrieval Augmentation,
CVPR24(11996-12005)
IEEE DOI
2410
Visualization, Image synthesis, Image databases, Computational modeling,
training data, Text to image, bias, fairness, generative-ai
BibRef
Jayasumana, S.[Sadeep],
Ramalingam, S.[Srikumar],
Veit, A.[Andreas],
Glasner, D.[Daniel],
Chakrabarti, A.[Ayan],
Kumar, S.[Sanjiv],
Rethinking FID: Towards a Better Evaluation Metric for Image
Generation,
CVPR24(9307-9315)
IEEE DOI Code:
WWW Link.
2410
Measurement, Machine learning algorithms, Image synthesis,
Text to image, Machine learning, Probability distribution, CMMD
BibRef
Wu, Y.[You],
Liu, K.[Kean],
Mi, X.Y.[Xiao-Yue],
Tang, F.[Fan],
Cao, J.[Juan],
Li, J.T.[Jin-Tao],
U-VAP: User-specified Visual Appearance Personalization via Decoupled
Self Augmentation,
CVPR24(9482-9491)
IEEE DOI Code:
WWW Link.
2410
Visualization, Semantics, Refining, Text to image,
Aerospace electronics, Controllability
BibRef
Ding, G.G.[Gang-Gui],
Zhao, C.[Canyu],
Wang, W.[Wen],
Yang, Z.[Zhen],
Liu, Z.[Zide],
Chen, H.[Hao],
Shen, C.H.[Chun-Hua],
FreeCustom: Tuning-Free Customized Image Generation for Multi-Concept
Composition,
CVPR24(9089-9098)
IEEE DOI
2410
Training, Codes, Image synthesis, Text to image,
Faces, image customization, diffusion model, generative model
BibRef
Chen, R.D.[Rui-Dong],
Wang, L.[Lanjun],
Nie, W.Z.[Wei-Zhi],
Zhang, Y.D.[Yong-Dong],
Liu, A.A.[An-An],
AnyScene: Customized Image Synthesis with Composited Foreground,
CVPR24(8724-8733)
IEEE DOI
2410
Measurement, Visualization, Image synthesis, Semantics, Layout,
Text to image, text to image generation,
generative model
BibRef
Yang, S.[Shuai],
Zhou, Y.F.[Yi-Fan],
Liu, Z.W.[Zi-Wei],
Loy, C.C.[Chen Change],
Fresco: Spatial-Temporal Correspondence for Zero-Shot Video
Translation,
CVPR24(8703-8712)
IEEE DOI
2410
Training, Visualization, Attention mechanisms, Superresolution,
Text to image, Coherence, diffusion, video-to-video translation,
intra-frame consistency
BibRef
Po, R.[Ryan],
Yang, G.[Guandao],
Aberman, K.[Kfir],
Wetzstein, G.[Gordon],
Orthogonal Adaptation for Modular Customization of Diffusion Models,
CVPR24(7964-7973)
IEEE DOI
2410
Adaptation models, Computational modeling, Scalability, Merging,
Text to image, Interference
BibRef
Bahmani, S.[Sherwin],
Skorokhodov, I.[Ivan],
Rong, V.[Victor],
Wetzstein, G.[Gordon],
Guibas, L.J.[Leonidas J.],
Wonka, P.[Peter],
Tulyakov, S.[Sergey],
Park, J.J.[Jeong Joon],
Tagliasacchi, A.[Andrea],
Lindell, D.B.[David B.],
4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling,
CVPR24(7996-8006)
IEEE DOI
2410
Measurement, Training, Solid modeling, Dynamics, Text to image,
Hybrid power systems
BibRef
Horita, D.[Daichi],
Inoue, N.[Naoto],
Kikuchi, K.[Kotaro],
Yamaguchi, K.[Kota],
Aizawa, K.[Kiyoharu],
Retrieval-Augmented Layout Transformer for Content-Aware Layout
Generation,
CVPR24(67-76)
IEEE DOI
2410
Visualization, Layout, Training data, Computer architecture,
Transformers, Generators, layout genration,
content-aware layout generation
BibRef
Zhang, S.[Sixian],
Wang, B.[Bohan],
Wu, J.Q.[Jun-Qiang],
Li, Y.[Yan],
Gao, T.T.[Ting-Ting],
Zhang, D.[Di],
Wang, Z.Y.[Zhong-Yuan],
Learning Multi-Dimensional Human Preference for Text-to-Image
Generation,
CVPR24(8018-8027)
IEEE DOI
2410
Measurement, Image synthesis, Annotations, Computational modeling,
Semantics, Text to image, Text-to-image generation, Evaluation
BibRef
Zhang, Y.M.[Yi-Ming],
Xing, Z.[Zhening],
Zeng, Y.H.[Yan-Hong],
Fang, Y.Q.[You-Qing],
Chen, K.[Kai],
PIA: Your Personalized Image Animator via Plug-and-Play Modules in
Text-to-Image Models,
CVPR24(7747-7756)
IEEE DOI
2410
Text to image, Benchmark testing, Animation, Controllability,
Tuning
BibRef
Huang, S.[Siteng],
Gong, B.[Biao],
Feng, Y.T.[Yu-Tong],
Chen, X.[Xi],
Fu, Y.Q.[Yu-Qian],
Liu, Y.[Yu],
Wang, D.L.[Dong-Lin],
Learning Disentangled Identifiers for Action-Customized Text-to-Image
Generation,
CVPR24(7797-7806)
IEEE DOI Code:
WWW Link.
2410
Animals, Semantics, Text to image, Feature extraction,
Contamination, text-to-image generation,
Action-Disentangled Identifier
BibRef
Chen, Z.J.[Zi-Jie],
Zhang, L.C.[Li-Chao],
Weng, F.S.[Fang-Sheng],
Pan, L.[Lili],
Lan, Z.Z.[Zhen-Zhong],
Tailored Visions: Enhancing Text-to-Image Generation with
Personalized Prompt Rewriting,
CVPR24(7727-7736)
IEEE DOI Code:
WWW Link.
2410
Visualization, Codes, Text to image
BibRef
Qu, L.G.[Lei-Gang],
Wang, W.J.[Wen-Jie],
Li, Y.Q.[Yong-Qi],
Zhang, H.W.[Han-Wang],
Nie, L.Q.[Li-Qiang],
Chua, T.S.[Tat-Seng],
Discriminative Probing and Tuning for Text-to-Image Generation,
CVPR24(7434-7444)
IEEE DOI Code:
WWW Link.
2410
Adaptation models, Large language models, Face recognition,
Computational modeling, Layout, Text to image
BibRef
Cheng, T.Y.[Ta-Ying],
Gadelha, M.[Matheus],
Groueix, T.[Thibault],
Fisher, M.[Matthew],
Mech, R.[Radomír],
Markham, A.[Andrew],
Trigoni, N.[Niki],
Learning Continuous 3D Words for Text-to-Image Generation,
CVPR24(6753-6762)
IEEE DOI Code:
WWW Link.
2410
Image recognition, Image synthesis, Text recognition, Shape,
Text to image, Lighting
BibRef
Ruiz, N.[Nataniel],
Li, Y.Z.[Yuan-Zhen],
Jampani, V.[Varun],
Wei, W.[Wei],
Hou, T.B.[Ting-Bo],
Pritch, Y.[Yael],
Wadhwa, N.[Neal],
Rubinstein, M.[Michael],
Aberman, K.[Kfir],
HyperDreamBooth: HyperNetworks for Fast Personalization of
Text-to-Image Models,
CVPR24(6527-6536)
IEEE DOI
2410
Generative AI, Face recognition, Semantics, Memory management,
Text to image, Graphics processing units, diffusion models,
subject driven personalization
BibRef
Zhang, Y.B.[Yan-Bing],
Yang, M.[Mengping],
Zhou, Q.[Qin],
Wang, Z.[Zhe],
Attention Calibration for Disentangled Text-to-Image Personalization,
CVPR24(4764-4774)
IEEE DOI
2410
Visualization, Solid modeling, Image synthesis, Pipelines,
Text to image, Text-to-image, Personalization, Attention Calibration
BibRef
Burgert, R.D.[Ryan D.],
Price, B.L.[Brian L.],
Kuen, J.[Jason],
Li, Y.J.[Yi-Jun],
Ryoo, M.S.[Michael S.],
MAGICK: A Large-Scale Captioned Dataset from Matting Generated Images
Using Chroma Keying,
CVPR24(22595-22604)
IEEE DOI Code:
WWW Link.
2410
Training, Hair, Image segmentation, Accuracy, Image synthesis,
Text to image, alpha, matting, dataset, generation, text, image,
compositing
BibRef
Dao, T.T.[Trung Tuan],
Vu, D.H.[Duc Hong],
Pham, C.[Cuong],
Tran, A.[Anh],
EFHQ: Multi-Purpose ExtremePose-Face-HQ Dataset,
CVPR24(22605-22615)
IEEE DOI
2410
Training, Deep learning, Face recognition, Pipelines, Text to image,
Benchmark testing
BibRef
Cazenavette, G.[George],
Sud, A.[Avneesh],
Leung, T.[Thomas],
Usman, B.[Ben],
FakeInversion: Learning to Detect Images from Unseen Text-to-Image
Models by Inverting Stable Diffusion,
CVPR24(10759-10769)
IEEE DOI
2410
Training, Visualization, Protocols, Text to image, Detectors,
Benchmark testing, Feature extraction, diffusion, fake detection
BibRef
Jayasumana, S.[Sadeep],
Glasner, D.[Daniel],
Ramalingam, S.[Srikumar],
Veit, A.[Andreas],
Chakrabarti, A.[Ayan],
Kumar, S.[Sanjiv],
MarkovGen: Structured Prediction for Efficient Text-to-Image
Generation,
CVPR24(9316-9325)
IEEE DOI
2410
Training, Image quality, Adaptation models, Image synthesis,
Computational modeling, Text to image, Predictive models, Image generation
BibRef
Ohanyan, M.[Marianna],
Manukyan, H.[Hayk],
Wang, Z.Y.[Zhang-Yang],
Navasardyan, S.[Shant],
Shi, H.[Humphrey],
Zero-Painter: Training-Free Layout Control for Text-to-Image
Synthesis,
CVPR24(8764-8774)
IEEE DOI
2410
Shape, Layout, Text to image
BibRef
Shi, J.[Jing],
Xiong, W.[Wei],
Lin, Z.[Zhe],
Jung, H.J.[Hyun Joon],
InstantBooth: Personalized Text-to-Image Generation without Test-Time
Finetuning,
CVPR24(8543-8552)
IEEE DOI Code:
WWW Link.
2410
Image quality, Adaptation models, Technological innovation,
Image synthesis, Scalability, Text to image, image generation
BibRef
Liang, Y.[Youwei],
He, J.F.[Jun-Feng],
Li, G.[Gang],
Li, P.Z.[Pei-Zhao],
Klimovskiy, A.[Arseniy],
Carolan, N.[Nicholas],
Sun, J.[Jiao],
Pont-Tuset, J.[Jordi],
Young, S.[Sarah],
Yang, F.[Feng],
Ke, J.J.[Jun-Jie],
Dvijotham, K.D.[Krishnamurthy Dj],
Collins, K.M.[Katherine M.],
Luo, Y.W.[Yi-Wen],
Li, Y.[Yang],
Kohlhoff, K.J.[Kai J],
Ramachandran, D.[Deepak],
Navalpakkam, V.[Vidhya],
Rich Human Feedback for Text-to-Image Generation,
CVPR24(19401-19411)
IEEE DOI Code:
WWW Link.
2410
Image synthesis, Large language models, Text to image,
Training data, Reinforcement learning, Predictive models,
rich human feedback
BibRef
Li, X.[Xiang],
Shen, Q.L.[Qian-Li],
Kawaguchi, K.[Kenji],
VA3: Virtually Assured Amplification Attack on Probabilistic
Copyright Protection for Text-to-Image Generative Models,
CVPR24(12363-12373)
IEEE DOI Code:
WWW Link.
2410
Codes, Text to image, Closed box, Copyright protection,
Probabilistic logic, copyright protection,
text-to-image
BibRef
d'Incà, M.[Moreno],
Peruzzo, E.[Elia],
Mancini, M.[Massimiliano],
Xu, D.[Dejia],
Goel, V.[Vidit],
Xu, X.Q.[Xing-Qian],
Wang, Z.Y.[Zhang-Yang],
Shi, H.[Humphrey],
Sebe, N.[Nicu],
OpenBias: Open-Set Bias Detection in Text-to-Image Generative Models,
CVPR24(12225-12235)
IEEE DOI
2410
Limiting, Prevention and mitigation, Large language models,
Pipelines, Knowledge based systems, Text to image, Generative AI,
Text-to-Image
BibRef
Le Coz, A.[Adrien],
Ouertatani, H.[Houssem],
Herbin, S.[Stéphane],
Adjed, F.[Faouzi],
Efficient Exploration of Image Classifier Failures with Bayesian
Optimization and Text-to-Image Models,
GCV24(7569-7578)
IEEE DOI
2410
Training, Costs, Image synthesis, Computational modeling,
Text to image, Benchmark testing, image classifier failures,
bayesian optimization
BibRef
Wang, Y.L.[Yi-Lin],
Xu, H.Y.[Hai-Yang],
Zhang, X.[Xiang],
Chen, Z.[Zeyuan],
Sha, Z.Z.[Zhi-Zhou],
Wang, Z.[Zirui],
Tu, Z.W.[Zhuo-Wen],
OmniControlNet: Dual-stage Integration for Conditional Image
Generation,
GCV24(7436-7448)
IEEE DOI
2410
Image synthesis, Image edge detection, Redundancy, Pipelines,
Text to image, Process control, Predictive models, Generative Models
BibRef
Zhao, Y.Q.[Yi-Qun],
Zhao, Z.[Zibo],
Li, J.[Jing],
Dong, S.[Sixun],
Gao, S.H.[Sheng-Hua],
RoomDesigner: Encoding Anchor-latents for Style-consistent and
Shape-compatible Indoor Scene Generation,
3DV24(1413-1423)
IEEE DOI
2408
Geometry, Shape, Vector quantization, Layout, Predictive models,
Transformers, 3D Scene Generation
BibRef
Ganz, R.[Roy],
Elad, M.[Michael],
CLIPAG: Towards Generator-Free Text-to-Image Generation,
WACV24(3831-3841)
IEEE DOI
2404
Computational modeling, Semantics, Computer architecture,
Generators, Task analysis, Image classification, Algorithms,
Vision + language and/or other modalities
BibRef
Park, S.[Seongbeom],
Moon, S.H.[Su-Hong],
Park, S.H.[Seung-Hyun],
Kim, J.[Jinkyu],
Localization and Manipulation of Immoral Visual Cues for Safe
Text-to-Image Generation,
WACV24(4663-4672)
IEEE DOI
2404
Location awareness, Ethics, Visualization, Analytical models,
Image recognition, Computational modeling, Algorithms, Explainable,
Vision + language and/or other modalities
BibRef
Jeanneret, G.[Guillaume],
Simon, L.[Loïc],
Jurie, F.[Frédéric],
Text-to-Image Models for Counterfactual Explanations:
A Black-Box Approach,
WACV24(4745-4755)
IEEE DOI
2404
Analytical models, Codes, Computational modeling, Closed box,
Computer architecture, Algorithms, Explainable, fair, accountable,
Vision + language and/or other modalities
BibRef
Grimal, P.[Paul],
Borgne, H.L.[Hervé Le],
Ferret, O.[Olivier],
Tourille, J.[Julien],
TIAM - A Metric for Evaluating Alignment in Text-to-Image Generation,
WACV24(2878-2887)
IEEE DOI
2404
Measurement, Image quality, Image color analysis,
Rendering (computer graphics), Colored noise, Algorithms,
Vision + language and/or other modalities
BibRef
Qin, C.[Can],
Yu, N.[Ning],
Xing, C.[Chen],
Zhang, S.[Shu],
Chen, Z.Y.[Ze-Yuan],
Ermon, S.[Stefano],
Fu, Y.[Yun],
Xiong, C.M.[Cai-Ming],
Xu, R.[Ran],
GlueGen: Plug and Play Multi-Modal Encoders for X-to-Image Generation,
ICCV23(23028-23039)
IEEE DOI
2401
BibRef
Bahmani, S.[Sherwin],
Park, J.J.[Jeong Joon],
Paschalidou, D.[Despoina],
Yan, X.G.[Xing-Guang],
Wetzstein, G.[Gordon],
Guibas, L.J.[Leonidas J.],
Tagliasacchi, A.[Andrea],
CC3D: Layout-Conditioned Generation of Compositional 3D Scenes,
ICCV23(7137-7147)
IEEE DOI
2401
BibRef
Lee, T.[Taegyeong],
Kang, J.[Jeonghun],
Kim, H.[Hyeonyu],
Kim, T.[Taehwan],
Generating Realistic Images from In-the-wild Sounds,
ICCV23(7126-7136)
IEEE DOI
2401
BibRef
Ye-Bin, M.[Moon],
Kim, J.[Jisoo],
Kim, H.Y.[Hong-Yeob],
Son, K.[Kilho],
Oh, T.H.[Tae-Hyun],
TextManiA: Enriching Visual Feature by Text-driven Manifold Augmentation,
ICCV23(2526-2537)
IEEE DOI
2401
BibRef
Ma, Y.W.[Yi-Wei],
Wang, H.[Haowei],
Zhang, X.Q.[Xiao-Qing],
Jiang, G.[Guannan],
Sun, X.S.[Xiao-Shuai],
Zhuang, W.L.[Wei-Lin],
Ji, J.Y.[Jia-Yi],
Ji, R.R.[Rong-Rong],
X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via
Dynamic Textual Guidance,
ICCV23(2737-2748)
IEEE DOI Code:
WWW Link.
2401
BibRef
Lin, J.W.[Jia-Wei],
Guo, J.Q.[Jia-Qi],
Sun, S.Z.[Shi-Zhao],
Xu, W.J.[Wei-Jiang],
Liu, T.[Ting],
Lou, J.G.[Jian-Guang],
Zhang, D.M.[Dong-Mei],
A Parse-Then-Place Approach for Generating Graphic Layouts from
Textual Descriptions,
ICCV23(23565-23574)
IEEE DOI
2401
BibRef
Liu, N.[Nan],
Du, Y.L.[Yi-Lun],
Li, S.[Shuang],
Tenenbaum, J.B.[Joshua B.],
Torralba, A.[Antonio],
Unsupervised Compositional Concepts Discovery with Text-to-Image
Generative Models,
ICCV23(2085-2095)
IEEE DOI
2401
BibRef
Wu, X.S.[Xiao-Shi],
Sun, K.Q.[Ke-Qiang],
Zhu, F.[Feng],
Zhao, R.[Rui],
Li, H.S.[Hong-Sheng],
Human Preference Score: Better Aligning Text-to-image Models with
Human Preference,
ICCV23(2096-2105)
IEEE DOI Code:
WWW Link.
2401
BibRef
Le, T.V.[Thanh Van],
Phung, H.[Hao],
Nguyen, T.H.[Thuan Hoang],
Dao, Q.[Quan],
Tran, N.N.[Ngoc N.],
Tran, A.[Anh],
Anti-DreamBooth: Protecting users from personalized text-to-image
synthesis,
ICCV23(2116-2127)
IEEE DOI Code:
WWW Link.
2401
BibRef
Agarwal, A.[Aishwarya],
Karanam, S.[Srikrishna],
Joseph, K.J.,
Saxena, A.[Apoorv],
Goswami, K.[Koustava],
Srinivasan, B.V.[Balaji Vasan],
A-STAR: Test-time Attention Segregation and Retention for
Text-to-image Synthesis,
ICCV23(2283-2293)
IEEE DOI
2401
BibRef
Cho, J.[Jaemin],
Zala, A.[Abhay],
Bansal, M.[Mohit],
DALL-EVAL: Probing the Reasoning Skills and Social Biases of
Text-to-Image Generation Models,
ICCV23(3020-3031)
IEEE DOI
2401
BibRef
Zhang, C.[Cheng],
Chen, X.[Xuanbai],
Chai, S.Q.[Si-Qi],
Wu, C.H.[Chen Henry],
Lagun, D.[Dmitry],
Beeler, T.[Thabo],
de la Torre, F.[Fernando],
ITI-Gen: Inclusive Text-to-Image Generation,
ICCV23(3946-3957)
IEEE DOI
2401
BibRef
Struppek, L.[Lukas],
Hintersdorf, D.[Dominik],
Kersting, K.[Kristian],
Rickrolling the Artist: Injecting Backdoors into Text Encoders for
Text-to-Image Synthesis,
ICCV23(4561-4573)
IEEE DOI Code:
WWW Link.
2401
BibRef
Basu, A.[Abhipsa],
Babu, R.V.[R. Venkatesh],
Pruthi, D.[Danish],
Inspecting the Geographical Representativeness of Images from
Text-to-Image Models,
ICCV23(5113-5124)
IEEE DOI
2401
BibRef
Wang, S.Y.[Sheng-Yu],
Efros, A.A.[Alexei A.],
Zhu, J.Y.[Jun-Yan],
Zhang, R.[Richard],
Evaluating Data Attribution for Text-to-Image Models,
ICCV23(7158-7169)
IEEE DOI
2401
BibRef
Park, M.H.[Min-Ho],
Yun, J.[Jooyeol],
Choi, S.[Seunghwan],
Choo, J.[Jaegul],
Learning to Generate Semantic Layouts for Higher Text-Image
Correspondence in Text-to-Image Synthesis,
ICCV23(7557-7566)
IEEE DOI Code:
WWW Link.
2401
BibRef
Höllein, L.[Lukas],
Cao, A.[Ang],
Owens, A.[Andrew],
Johnson, J.[Justin],
Nießner, M.[Matthias],
Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models,
ICCV23(7875-7886)
IEEE DOI
2401
BibRef
Wei, Y.X.[Yu-Xiang],
Zhang, Y.[Yabo],
Ji, Z.L.[Zhi-Long],
Bai, J.F.[Jin-Feng],
Zhang, L.[Lei],
Zuo, W.M.[Wang-Meng],
ELITE: Encoding Visual Concepts into Textual Embeddings for
Customized Text-to-Image Generation,
ICCV23(15897-15907)
IEEE DOI Code:
WWW Link.
2401
BibRef
Bakr, E.M.[Eslam Mohamed],
Sun, P.Z.[Peng-Zhan],
Shen, X.Q.[Xiao-Qian],
Khan, F.F.[Faizan Farooq],
Li, L.E.[Li Erran],
Elhoseiny, M.[Mohamed],
HRS-Bench: Holistic, Reliable and Scalable Benchmark for
Text-to-Image Models,
ICCV23(19984-19996)
IEEE DOI Code:
WWW Link.
2401
BibRef
Lee, J.[Jaewoong],
Jang, S.[Sangwon],
Jo, J.[Jaehyeong],
Yoon, J.[Jaehong],
Kim, Y.J.[Yun-Ji],
Kim, J.H.[Jin-Hwa],
Ha, J.W.[Jung-Woo],
Hwang, S.J.[Sung Ju],
Text-Conditioned Sampling Framework for Text-to-Image Generation with
Masked Generative Models,
ICCV23(23195-23205)
IEEE DOI
2401
BibRef
Hou, X.[Xia],
Sun, M.[Meng],
Song, W.F.[Wen-Feng],
Tell Your Story: Text-Driven Face Video Synthesis with High Diversity
via Adversarial Learning,
ICIP23(515-519)
IEEE DOI Code:
WWW Link.
2312
BibRef
Zhang, Z.Q.[Zhi-Qiang],
Xu, J.Y.[Jia-Yao],
Morita, R.[Ryugo],
Yu, W.X.[Wen-Xin],
Zhou, J.J.[Jin-Jia],
Dynamic Unilateral Dual Learning for Text to Image Synthesis,
ICIP23(1130-1134)
IEEE DOI
2312
BibRef
Mao, J.F.[Jia-Feng],
Wang, X.T.[Xue-Ting],
Training-Free Location-Aware Text-to-Image Synthesis,
ICIP23(995-999)
IEEE DOI
2312
BibRef
Chen, W.J.[Wen-Jie],
Ni, Z.K.[Zhang-Kai],
Wang, H.L.[Han-Li],
Structure-Aware Generative Adversarial Network for Text-to-Image
Generation,
ICIP23(2075-2079)
IEEE DOI
2312
BibRef
Morita, R.[Ryugo],
Zhang, Z.Q.[Zhi-Qiang],
Zhou, J.J.[Jin-Jia],
BATINeT: Background-Aware Text to Image Synthesis and Manipulation
Network,
ICIP23(765-769)
IEEE DOI
2312
BibRef
Yang, S.S.[Shu-Sheng],
Ge, Y.X.[Yi-Xiao],
Yi, K.[Kun],
Li, D.[Dian],
Shan, Y.[Ying],
Qie, X.[Xiaohu],
Wang, X.G.[Xing-Gang],
RILS: Masked Visual Reconstruction in Language Semantic Space,
CVPR23(23304-23314)
IEEE DOI
2309
BibRef
Wei, J.C.[Jia-Cheng],
Wang, H.[Hao],
Feng, J.S.[Jia-Shi],
Lin, G.S.[Guo-Sheng],
Yap, K.H.[Kim-Hui],
TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo
Supervision,
CVPR23(16805-16815)
IEEE DOI
2309
BibRef
Zeng, Y.[Yu],
Lin, Z.[Zhe],
Zhang, J.M.[Jian-Ming],
Liu, Q.[Qing],
Collomosse, J.[John],
Kuen, J.[Jason],
Patel, V.M.[Vishal M.],
SceneComposer: Any-Level Semantic Image Synthesis,
CVPR23(22468-22478)
IEEE DOI
2309
BibRef
Lin, J.[Junfan],
Chang, J.L.[Jian-Long],
Liu, L.B.[Ling-Bo],
Li, G.B.[Guan-Bin],
Lin, L.[Liang],
Tian, Q.[Qi],
Chen, C.W.[Chang Wen],
Being Comes from Not-Being: Open-Vocabulary Text-to-Motion Generation
with Wordless Training,
CVPR23(23222-23231)
IEEE DOI
2309
BibRef
Yang, Z.Y.[Zheng-Yuan],
Wang, J.F.[Jian-Feng],
Gan, Z.[Zhe],
Li, L.J.[Lin-Jie],
Lin, K.[Kevin],
Wu, C.[Chenfei],
Duan, N.[Nan],
Liu, Z.C.[Zi-Cheng],
Liu, C.[Ce],
Zeng, M.[Michael],
Wang, L.J.[Li-Juan],
ReCo: Region-Controlled Text-to-Image Generation,
CVPR23(14246-14255)
IEEE DOI
2309
BibRef
Otani, M.[Mayu],
Togashi, R.[Riku],
Sawai, Y.[Yu],
Ishigami, R.[Ryosuke],
Nakashima, Y.[Yuta],
Rahtu, E.[Esa],
Heikkilä, J.[Janne],
Satoh, S.[Shin'ichi],
Toward Verifiable and Reproducible Human Evaluation for Text-to-Image
Generation,
CVPR23(14277-14286)
IEEE DOI
2309
BibRef
Liu, H.[Han],
Wu, Y.H.[Yu-Hao],
Zhai, S.[Shixuan],
Yuan, B.[Bo],
Zhang, N.[Ning],
RIATIG: Reliable and Imperceptible Adversarial Text-to-Image
Generation with Natural Prompts,
CVPR23(20585-20594)
IEEE DOI
2309
BibRef
Kang, M.[Minguk],
Zhu, J.Y.[Jun-Yan],
Zhang, R.[Richard],
Park, J.[Jaesik],
Shechtman, E.[Eli],
Paris, S.[Sylvain],
Park, T.[Taesung],
Scaling up GANs for Text-to-Image Synthesis,
CVPR23(10124-10134)
IEEE DOI
2309
BibRef
Careil, M.[Marlène],
Verbeek, J.[Jakob],
Lathuilière, S.[Stéphane],
Few-shot Semantic Image Synthesis with Class Affinity Transfer,
CVPR23(23611-23620)
IEEE DOI
2309
BibRef
Kang, M.S.[Min-Soo],
Lee, D.[Doyup],
Kim, J.[Jiseob],
Kim, S.[Saehoon],
Han, B.H.[Bo-Hyung],
Variational Distribution Learning for Unsupervised Text-to-Image
Generation,
CVPR23(23380-23389)
IEEE DOI
2309
BibRef
Sung-Bin, K.[Kim],
Senocak, A.[Arda],
Ha, H.W.[Hyun-Woo],
Owens, A.[Andrew],
Oh, T.H.[Tae-Hyun],
Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment,
CVPR23(6430-6440)
IEEE DOI
2309
BibRef
Cong, Y.[Yuren],
Yi, J.H.[Jin-Hui],
Rosenhahn, B.[Bodo],
Yang, M.Y.[Michael Ying],
SSGVS: Semantic Scene Graph-to-Video Synthesis,
MULA23(2555-2565)
IEEE DOI
2309
BibRef
Zhang, S.X.[Si-Xian],
Song, X.H.[Xin-Hang],
Li, W.J.[Wei-Jie],
Bai, Y.B.[Yu-Bing],
Yu, X.Y.[Xin-Yao],
Jiang, S.Q.[Shu-Qiang],
Layout-based Causal Inference for Object Navigation,
CVPR23(10792-10802)
IEEE DOI
2309
BibRef
Hsu, H.Y.[Hsiao-Yuan],
He, X.T.[Xiang-Teng],
Peng, Y.X.[Yu-Xin],
Kong, H.[Hao],
Zhang, Q.[Qing],
PosterLayout: A New Benchmark and Approach for Content-Aware
Visual-Textual Presentation Layout,
CVPR23(6018-6026)
IEEE DOI
2309
BibRef
Xue, H.[Han],
Huang, Z.W.[Zhi-Wu],
Sun, Q.[Qianru],
Song, L.[Li],
Zhang, W.J.[Wen-Jun],
Freestyle Layout-to-Image Synthesis,
CVPR23(14256-14266)
IEEE DOI
2309
BibRef
Jiang, Z.Y.[Zhao-Yun],
Guo, J.Q.[Jia-Qi],
Sun, S.Z.[Shi-Zhao],
Deng, H.Y.[Hua-Yu],
Wu, Z.K.[Zhong-Kai],
Mijovic, V.[Vuksan],
Yang, Z.J.J.[Zi-Jiang James],
Lou, J.G.[Jian-Guang],
Zhang, D.M.[Dong-Mei],
LayoutFormer++: Conditional Graphic Layout Generation via Constraint
Serialization and Decoding Space Restriction,
CVPR23(18403-18412)
IEEE DOI
2309
BibRef
Akula, A.R.[Arjun R.],
Driscoll, B.[Brendan],
Narayana, P.[Pradyumna],
Changpinyo, S.[Soravit],
Jia, Z.W.[Zhi-Wei],
Damle, S.[Suyash],
Pruthi, G.[Garima],
Basu, S.[Sugato],
Guibas, L.J.[Leonidas J.],
Freeman, W.T.[William T.],
Li, Y.Z.[Yuan-Zhen],
Jampani, V.[Varun],
MetaCLUE: Towards Comprehensive Visual Metaphors Research,
CVPR23(23201-23211)
IEEE DOI
2309
BibRef
Hwang, I.[Inwoo],
Kim, H.[Hyeonwoo],
Kim, Y.M.[Young Min],
Text2Scene: Text-driven Indoor Scene Stylization with Part-Aware
Details,
CVPR23(1890-1899)
IEEE DOI
2309
BibRef
Li, Y.H.[Yu-Heng],
Liu, H.T.[Hao-Tian],
Wu, Q.Y.[Qing-Yang],
Mu, F.Z.[Fang-Zhou],
Yang, J.W.[Jian-Wei],
Gao, J.F.[Jian-Feng],
Li, C.Y.[Chun-Yuan],
Lee, Y.J.[Yong Jae],
GLIGEN: Open-Set Grounded Text-to-Image Generation,
CVPR23(22511-22521)
IEEE DOI
2309
BibRef
Lai, B.[Borun],
Ma, L.H.[Li-Hong],
Tian, J.[Jing],
Gated Cross Word-visual Attention-driven Generative Adversarial
Networks for Text-to-image Synthesis,
ACCV22(VII:88-100).
Springer DOI
2307
BibRef
Wang, Z.W.[Zhi-Wei],
Yang, J.[Jing],
Cui, J.J.[Jia-Jun],
Liu, J.W.[Jia-Wei],
Wang, J.H.[Jia-Hao],
DAC-GAN: Dual Auxiliary Consistency Generative Adversarial Network for
Text-to-Image Generation,
ACCV22(VII:3-19).
Springer DOI
2307
BibRef
Liang, M.L.[Ming-Liang],
Liu, Z.R.[Zhuo-Ran],
Larson, M.[Martha],
Textual Concept Expansion with Commonsense Knowledge to Improve
Dual-Stream Image-Text Matching,
MMMod23(I: 421-433).
Springer DOI
2304
Text as input, output concepts
BibRef
Loeschcke, S.[Sebastian],
Belongie, S.[Serge],
Benaim, S.[Sagie],
Text-driven Stylization of Video Objects,
CVEU22(594-609).
Springer DOI
2304
BibRef
Zhou, L.L.[Long-Long],
Wu, X.J.[Xiao-Jun],
Xu, T.Y.[Tian-Yang],
COMIM-GAN: Improved Text-to-Image Generation via Condition Optimization
and Mutual Information Maximization,
MMMod23(I: 385-396).
Springer DOI
2304
BibRef
Lee, H.[Hanbit],
Kim, Y.[Youna],
Lee, S.G.[Sang-Goo],
Multi-scale Contrastive Learning for Complex Scene Generation,
WACV23(764-774)
IEEE DOI
2302
Semantics, Generative adversarial networks, Generators,
Data models, Task analysis, image and video synthesis
BibRef
Kim, J.Y.[Jih-Yun],
Jeong, S.H.[Seong-Hun],
Kong, K.[Kyeongbo],
Kang, S.J.[Suk-Ju],
An Unified Framework for Language Guided Image Completion,
WACV23(2567-2577)
IEEE DOI
2302
Training, Visualization, Image synthesis, Computational modeling,
Natural languages, Complexity theory,
Vision + language and/or other modalities
BibRef
Liao, W.T.[Wen-Tong],
Hu, K.[Kai],
Yang, M.Y.[Michael Ying],
Rosenhahn, B.[Bodo],
Text to Image Generation with Semantic-Spatial Aware GAN,
CVPR22(18166-18175)
IEEE DOI
2210
Visualization, Image recognition, Image synthesis, Fuses,
Computational modeling, Semantics,
Vision+language
BibRef
He, S.[Sen],
Liao, W.T.[Wen-Tong],
Yang, M.Y.[Michael Ying],
Yang, Y.X.[Yong-Xin],
Song, Y.Z.[Yi-Zhe],
Rosenhahn, B.[Bodo],
Xiang, T.[Tao],
Context-Aware Layout to Image Generation with Enhanced Object
Appearance,
CVPR21(15044-15053)
IEEE DOI
2111
Visualization, Image synthesis, Computational modeling, Layout,
Benchmark testing, Inspection, Generators
BibRef
Wang, Z.K.[Ze-Kang],
Liu, L.[Li],
Zhang, H.X.[Hua-Xiang],
Ma, Y.[Yue],
Cui, H.L.[Huai-Lei],
Chen, Y.[Yuan],
Kong, H.R.[Hao-Ran],
Generative Adversarial Networks Based on Dynamic Word-Level Update
for Text-to-Image Synthesis,
ICIVC22(641-647)
IEEE DOI
2301
Training, Image synthesis, Semantics, Benchmark testing,
Generative adversarial networks, Visual effects, Generators,
hierarchical image generation
BibRef
Li, H.[Hui],
Yuan, X.C.[Xu-Chang],
Image Generation Method of Bird Text Based on Improved StackGAN,
ICIVC22(805-811)
IEEE DOI
2301
Training, Image synthesis, Convolution, Computational modeling, Semantics,
Birds, Cultural differences, Text to image, StackGAN, Residual structure
BibRef
Liu, X.[Xian],
Xu, Y.H.[Ying-Hao],
Wu, Q.Y.[Qian-Yi],
Zhou, H.[Hang],
Wu, W.[Wayne],
Zhou, B.[Bolei],
Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation,
ECCV22(XXXVII:106-125).
Springer DOI
2211
BibRef
Li, B.[Bowen],
Word-Level Fine-Grained Story Visualization,
ECCV22(XXXVI:347-362).
Springer DOI
2211
BibRef
Tan, R.[Reuben],
Plummer, B.A.[Bryan A.],
Saenko, K.[Kate],
Lewis, J.P.,
Sud, A.[Avneesh],
Leung, T.[Thomas],
NewsStories: Illustrating Articles with Visual Summaries,
ECCV22(XXXVI:644-661).
Springer DOI
2211
BibRef
Roy, P.[Prasun],
Ghosh, S.[Subhankar],
Bhattacharya, S.[Saumik],
Pal, U.[Umapada],
Blumenstein, M.[Michael],
TIPS: Text-Induced Pose Synthesis,
ECCV22(XXXVIII:161-178).
Springer DOI
2211
BibRef
Shi, Z.F.[Zi-Fan],
Shen, Y.J.[Yu-Jun],
Zhu, J.P.[Jia-Peng],
Yeung, D.Y.[Dit-Yan],
Chen, Q.F.[Qi-Feng],
3D-Aware Indoor Scene Synthesis with Depth Priors,
ECCV22(XVI:406-422).
Springer DOI
2211
BibRef
Lee, S.H.[Seung Hyun],
Oh, G.[Gyeongrok],
Byeon, W.[Wonmin],
Kim, C.[Chanyoung],
Ryoo, W.J.[Won Jeong],
Yoon, S.H.[Sang Ho],
Cho, H.[Hyunjun],
Bae, J.Y.[Jih-Yun],
Kim, J.[Jinkyu],
Kim, S.[Sangpil],
Sound-Guided Semantic Video Generation,
ECCV22(XVII:34-50).
Springer DOI
2211
BibRef
Yan, K.[Kun],
Ji, L.[Lei],
Wu, C.F.[Chen-Fei],
Bao, J.M.[Jian-Min],
Zhou, M.[Ming],
Duan, N.[Nan],
Ma, S.[Shuai],
Trace Controlled Text to Image Generation,
ECCV22(XXXVI:59-75).
Springer DOI
2211
BibRef
Dinh, T.M.[Tan M.],
Nguyen, R.[Rang],
Hua, B.S.[Binh-Son],
TISE: Bag of Metrics for Text-to-Image Synthesis Evaluation,
ECCV22(XXXVI:594-609).
Springer DOI
2211
BibRef
Zhang, J.H.[Jia-Hui],
Zhan, F.N.[Fang-Neng],
Theobalt, C.[Christian],
Lu, S.J.[Shi-Jian],
Regularized Vector Quantization for Tokenized Image Synthesis,
CVPR23(18467-18476)
IEEE DOI
2309
BibRef
Zhan, F.N.[Fang-Neng],
Zhang, J.H.[Jia-Hui],
Yu, Y.C.[Ying-Chen],
Wu, R.L.[Rong-Liang],
Lu, S.J.[Shi-Jian],
Modulated Contrast for Versatile Image Synthesis,
CVPR22(18259-18269)
IEEE DOI
2210
Photography, Visualization, Codes, Image synthesis, Force,
Performance gain, Image and video synthesis and generation,
Computational photography
BibRef
Qiao, X.T.[Xiao-Tian],
Hancke, G.P.[Gerhard P.],
Lau, R.W.H.[Rynson W.H.],
Learning Object Context for Novel-view Scene Layout Generation,
CVPR22(16969-16978)
IEEE DOI
2210
Computational modeling, Layout, Semantics, Predictive models,
Cameras, Probabilistic logic, Scene analysis and understanding,
Image and video synthesis and generation
BibRef
Ntavelis, E.[Evangelos],
Shahbazi, M.[Mohamad],
Kastanis, I.[Iason],
Timofte, R.[Radu],
Danelljan, M.[Martin],
Van Gool, L.J.[Luc J.],
Arbitrary-Scale Image Synthesis,
CVPR22(11523-11532)
IEEE DOI
2210
Training, Image coding, Image synthesis, Pipelines,
Generative adversarial networks, Encoding,
Image and video synthesis and generation
BibRef
Georgopoulos, M.[Markos],
Oldfield, J.[James],
Chrysos, G.G.[Grigorios G.],
Panagakis, Y.[Yannis],
Cluster-guided Image Synthesis with Unconditional Models,
CVPR22(11533-11542)
IEEE DOI
2210
Hair, Maximum likelihood estimation, Image synthesis, Semantics,
Process control, Generative adversarial networks, Generators,
Explainable computer vision
BibRef
Wei, Y.X.[Yu-Xiang],
Ji, Z.L.[Zhi-Long],
Wu, X.H.[Xiao-He],
Bai, J.F.[Jin-Feng],
Zhang, L.[Lei],
Zuo, W.M.[Wang-Meng],
Inferring and Leveraging Parts from Object Shape for Improving
Semantic Image Synthesis,
CVPR23(11248-11258)
IEEE DOI
2309
BibRef
Lv, Z.Y.[Zheng-Yao],
Wei, Y.X.[Yu-Xiang],
Zuo, W.M.[Wang-Meng],
Wong, K.Y.K.[Kwan-Yee K.],
PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis,
CVPR24(9264-9274)
IEEE DOI
2410
Adaptation models, Visualization, Image synthesis, Source coding,
Semantics, Layout, Text to image, semantic image synthesis
BibRef
Lv, Z.Y.[Zheng-Yao],
Li, X.M.[Xiao-Ming],
Niu, Z.X.[Zhen-Xing],
Cao, B.[Bing],
Zuo, W.M.[Wang-Meng],
Semantic-shape Adaptive Feature Modulation for Semantic Image
Synthesis,
CVPR22(11204-11213)
IEEE DOI
2210
Adaptation models, Codes, Shape, Image synthesis, Convolution,
Semantics, Image and video synthesis and generation
BibRef
Shi, Y.P.[Yu-Peng],
Liu, X.[Xiao],
Wei, Y.X.[Yu-Xiang],
Wu, Z.Q.[Zhong-Qin],
Zuo, W.M.[Wang-Meng],
Retrieval-based Spatially Adaptive Normalization for Semantic Image
Synthesis,
CVPR22(11214-11223)
IEEE DOI
2210
Training, Visualization, Image synthesis, Shape, Navigation, Semantics,
Wheels, Image and video synthesis and generation
BibRef
Shim, S.H.[Sang-Heon],
Hyun, S.[Sangeek],
Bae, D.H.[Dae-Hyun],
Heo, J.P.[Jae-Pil],
Local Attention Pyramid for Scene Image Generation,
CVPR22(7764-7772)
IEEE DOI
2210
Measurement, Deep learning, Visualization, Image segmentation,
Image analysis, Image synthesis,
Scene analysis and understanding
BibRef
Wang, B.[Bo],
Wu, T.[Tao],
Zhu, M.[Minfeng],
Du, P.[Peng],
Interactive Image Synthesis with Panoptic Layout Generation,
CVPR22(7773-7782)
IEEE DOI
2210
Visualization, Image synthesis, Shape, Perturbation methods, Layout,
Semantics, Genomics, Image and video synthesis and generation
BibRef
Yang, Z.P.[Zuo-Peng],
Liu, D.Q.[Da-Qing],
Wang, C.Y.[Chao-Yue],
Yang, J.[Jie],
Tao, D.C.[Da-Cheng],
Modeling Image Composition for Complex Scene Generation,
CVPR22(7754-7763)
IEEE DOI
2210
Training, Measurement, Visualization, Image coding, Layout, Genomics,
Predictive models, Image and video synthesis and generation
BibRef
Jeong, J.[Jaebong],
Jo, J.[Janghun],
Cho, S.[Sunghyun],
Park, J.[Jaesik],
3D Scene Painting via Semantic Image Synthesis,
CVPR22(2252-2262)
IEEE DOI
2210
Training, Solid modeling, Image color analysis, Image synthesis,
Machine vision, Semantics, Vision applications and systems, Vision + graphics
BibRef
Aldausari, N.[Nuha],
Sowmya, A.[Arcot],
Marcus, N.[Nadine],
Mohammadi, G.[Gelareh],
Cascaded Siamese Self-supervised Audio to Video GAN,
MULA22(4690-4699)
IEEE DOI
2210
Solid modeling, Correlation, Computational modeling,
Pattern recognition
BibRef
Tao, M.[Ming],
Tang, H.[Hao],
Wu, F.[Fei],
Jing, X.Y.[Xiao-Yuan],
Bao, B.K.[Bing-Kun],
Xu, C.S.[Chang-Sheng],
DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis,
CVPR22(16494-16504)
IEEE DOI
2210
Visualization, Codes, Semantics,
Generative adversarial networks, Generators, Vision+language,
Image and video synthesis and generation
BibRef
Zhou, Y.F.[Yu-Fan],
Zhang, R.[Ruiyi],
Chen, C.Y.[Chang-You],
Li, C.Y.[Chun-Yuan],
Tensmeyer, C.[Chris],
Yu, T.[Tong],
Gu, J.X.[Jiu-Xiang],
Xu, J.H.[Jin-Hui],
Sun, T.[Tong],
Towards Language-Free Training for Text-to-Image Generation,
CVPR22(17886-17896)
IEEE DOI
2210
Training, Image synthesis, Semantics, Training data, Tail,
Data collection, Data models, Vision+language,
Image and video synthesis and generation
BibRef
Li, Z.H.[Zhi-Heng],
Min, M.R.[Martin Renqiang],
Li, K.[Kai],
Xu, C.L.[Chen-Liang],
StyleT2I: Toward Compositional and High-Fidelity Text-to-Image
Synthesis,
CVPR22(18176-18186)
IEEE DOI
2210
Measurement, Ethics, Image synthesis, Computational modeling,
Semantics, Robustness, Image and video synthesis and generation,
Vision+language
BibRef
Sanghi, A.[Aditya],
Chu, H.[Hang],
Lambourne, J.G.[Joseph G.],
Wang, Y.[Ye],
Cheng, C.Y.[Chin-Yi],
Fumero, M.[Marco],
Malekshan, K.R.[Kamal Rahimi],
CLIP-Forge: Towards Zero-Shot Text-to-Shape Generation,
CVPR22(18582-18592)
IEEE DOI
2210
Training, Point cloud compression, Shape, Semantics,
Natural languages, Vision + graphics,
Vision+language
BibRef
Jain, A.[Ajay],
Mildenhall, B.[Ben],
Barron, J.T.[Jonathan T.],
Abbeel, P.[Pieter],
Poole, B.[Ben],
Zero-Shot Text-Guided Object Generation with Dream Fields,
CVPR22(857-866)
IEEE DOI
2210
Geometry, Visualization, Solid modeling, Image color analysis, Shape,
Deep learning architectures and techniques,
Vision applications and systems
BibRef
Bazazian, D.[Dena],
Calway, A.[Andrew],
Damen, D.[Dima],
Dual-Domain Image Synthesis using Segmentation-Guided GAN,
NTIRE22(506-515)
IEEE DOI
2210
Hair, Training, Image segmentation, Codes, Semantics, Nose, Mouth
BibRef
Yang, Y.Y.[Yu-Yan],
Ni, X.[Xin],
Hao, Y.B.[Yan-Bin],
Liu, C.Y.[Chen-Yu],
Wang, W.S.[Wen-Shan],
Liu, Y.F.[Yi-Feng],
Xi, H.Y.[Hai-Yong],
MF-GAN: Multi-conditional Fusion Generative Adversarial Network for
Text-to-Image Synthesis,
MMMod22(I:41-53).
Springer DOI
2203
Best paper section
BibRef
Wang, Y.[Yi],
Qi, L.[Lu],
Chen, Y.C.[Ying-Cong],
Zhang, X.Y.[Xiang-Yu],
Jia, J.Y.[Jia-Ya],
Image Synthesis via Semantic Composition,
ICCV21(13729-13738)
IEEE DOI
2203
Correlation, Image synthesis, Convolution, Semantics, Layout,
Benchmark testing, Image and video synthesis,
Neural generative models
BibRef
Dhamo, H.[Helisa],
Manhardt, F.[Fabian],
Navab, N.[Nassir],
Tombari, F.[Federico],
Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes
Using Scene Graphs,
ICCV21(16332-16341)
IEEE DOI
2203
Point cloud compression, Visualization, Solid modeling, Shape,
Semantics, Scene analysis and understanding,
BibRef
Li, Z.J.[Ze-Jian],
Wu, J.Y.[Jing-Yu],
Koh, I.[Immanuel],
Tang, Y.C.[Yong-Chuan],
Sun, L.Y.[Ling-Yun],
Image Synthesis from Layout with Locality-Aware Mask Adaption,
ICCV21(13799-13808)
IEEE DOI
2203
Adaptation models, Visualization, Image segmentation,
Image synthesis, Computational modeling, Layout,
Neural generative models
BibRef
Qi, Y.G.[Yong-Gang],
Su, G.Y.[Guo-Yao],
Chowdhury, P.N.[Pinaki Nath],
Li, M.K.[Ming-Kang],
Song, Y.Z.[Yi-Zhe],
SketchLattice: Latticed Representation for Sketch Manipulation,
ICCV21(933-941)
IEEE DOI
2203
Image quality, Limiting, Computational modeling, Lattices,
Task analysis, Vision + other modalities,
Vision applications and systems
BibRef
Yang, L.[Lan],
Pang, K.Y.[Kai-Yue],
Zhang, H.G.[Hong-Gang],
Song, Y.Z.[Yi-Zhe],
SketchAA: Abstract Representation for Abstract Sketches,
ICCV21(10077-10086)
IEEE DOI
2203
Visualization, Image recognition, Codes, Computational modeling,
Image retrieval, Rendering (computer graphics),
Vision applications and systems
BibRef
Canfes, Z.[Zehranaz],
Atasoy, M.F.[M. Furkan],
Dirik, A.[Alara],
Yanardag, P.[Pinar],
Text and Image Guided 3D Avatar Generation and Manipulation,
WACV23(4410-4420)
IEEE DOI
2302
Solid modeling, Shape, Avatars, Source coding, Pipelines,
Process control, Algorithms: 3D computer vision, Biometrics, face, body pose
BibRef
Kocasari, U.[Umut],
Dirik, A.[Alara],
Tiftikci, M.[Mert],
Yanardag, P.[Pinar],
StyleMC: Multi-Channel Based Fast Text-Guided Image Generation and
Manipulation,
WACV22(3441-3450)
IEEE DOI
2202
Training, Hair, Codes, Image synthesis,
Image color analysis, Semantics, Deep Learning
BibRef
Xiang, X.Y.[Xiao-Yu],
Liu, D.[Ding],
Yang, X.[Xiao],
Zhu, Y.H.[Yi-Heng],
Shen, X.H.[Xiao-Hui],
Allebach, J.P.[Jan P.],
Adversarial Open Domain Adaptation for Sketch-to-Photo Synthesis,
WACV22(944-954)
IEEE DOI
2202
Training, Image color analysis, Training data,
Distortion, Generators, Optimization,
Image and Video Synthesis
BibRef
Ivgi, M.[Maor],
Benny, Y.[Yaniv],
Ben-David, A.[Avichai],
Berant, J.[Jonathan],
Wolf, L.B.[Lior B.],
Scene Graph To Image Generation with Contextualized Object Layout
Refinement,
ICIP21(2428-2432)
IEEE DOI
2201
Image synthesis, Layout, Predictive models, Task analysis,
Context modeling, Image Synthesis, Scene Graph, GAN
BibRef
Jeon, E.[Eunyeong],
Kim, K.[Kunhee],
Kim, D.J.[Dai-Jin],
FA-GAN: Feature-Aware GAN for Text to Image Synthesis,
ICIP21(2443-2447)
IEEE DOI
2201
Image synthesis, Natural languages,
Generative adversarial networks, Feature extraction, Generators,
Feature-Aware GAN
BibRef
Zhang, Z.Q.[Zhi-Qiang],
Yu, W.X.[Wen-Xin],
Jiang, N.[Ning],
Zhou, J.J.[Jin-Jia],
Text To Image Synthesis With Erudite Generative Adversarial Networks,
ICIP21(2438-2442)
IEEE DOI
2201
Image synthesis, Generative adversarial networks, Data models,
Task analysis, Text-to-Image Synthesis,
Generative Adversarial Networks
BibRef
Yuan, S.Z.[Shao-Zu],
Dai, A.[Aijun],
Yan, Z.L.[Zhi-Ling],
Guo, Z.[Zehua],
Liu, R.X.[Rui-Xue],
Chen, M.[Meng],
SketchBird: Learning to Generate Bird Sketches from Text,
SHE21(2443-2452)
IEEE DOI
2112
Fuses, Shape, Error analysis, Image edge detection,
Computational modeling
BibRef
Berardi, G.[Gianluca],
Salti, S.[Samuele],
di Stefano, L.[Luigi],
SketchyDepth: from Scene Sketches to RGB-D Images,
SHE21(2414-2423)
IEEE DOI
2112
Training, Geometry, Image synthesis, Annotations, Conferences
BibRef
Lu, X.P.[Xiao-Peng],
Ng, L.[Lynnette],
Fernandez, J.[Jared],
Zhu, H.[Hao],
CIGLI: Conditional Image Generation from Language & Image,
CLVL21(3127-3131)
IEEE DOI
2112
Codes, Image synthesis, Computational modeling,
Semantics, Cognition
BibRef
Dorkenwald, M.[Michael],
Milbich, T.[Timo],
Blattmann, A.[Andreas],
Rombach, R.[Robin],
Derpanis, K.G.[Konstantinos G.],
Ommer, B.[Björn],
Stochastic Image-to-Video Synthesis using cINNs,
CVPR21(3741-3752)
IEEE DOI
2111
Neural networks, Stochastic processes,
Process control, Predictive models, Probabilistic logic
BibRef
Zhang, H.[Han],
Koh, J.Y.[Jing Yu],
Baldridge, J.[Jason],
Lee, H.L.[Hong-Lak],
Yang, Y.F.[Yin-Fei],
Cross-Modal Contrastive Learning for Text-to-Image Generation,
CVPR21(833-842)
IEEE DOI
2111
Image quality, Image synthesis, Computational modeling,
Impedance matching, Semantics, Natural languages,
Generative adversarial networks
BibRef
Koh, J.Y.[Jing Yu],
Baldridge, J.[Jason],
Lee, H.L.[Hong-Lak],
Yang, Y.F.[Yin-Fei],
Text-to-Image Generation Grounded by Fine-Grained User Attention,
WACV21(237-246)
IEEE DOI
2106
Measurement, Image segmentation, Visualization,
Grounding, Natural languages
BibRef
Long, J.[Jia],
Lu, H.T.[Hong-Tao],
Multi-level Gate Feature Aggregation with Spatially Adaptive
Batch-instance Normalization for Semantic Image Synthesis,
MMMod21(I:378-390).
Springer DOI
2106
BibRef
Yan, J.W.[Jia-Wei],
Lin, C.S.[Ci-Siang],
Yang, F.E.[Fu-En],
Li, Y.J.[Yu-Jhe],
Wang, Y.C.F.[Yu-Chiang Frank],
Semantics-Guided Representation Learning with Applications to Visual
Synthesis,
ICPR21(7181-7187)
IEEE DOI
2105
Visualization, Interpolation,
Computational modeling, Semantics, Data visualization, Semantic interpolation
BibRef
Tang, S.C.[Shi-Chang],
Zhou, X.[Xu],
He, X.M.[Xu-Ming],
Ma, Y.[Yi],
Disentangled Representation Learning for Controllable Image
Synthesis: An Information-Theoretic Perspective,
ICPR21(10042-10049)
IEEE DOI
2105
Training, Image synthesis, Image color analysis,
Mutual information
BibRef
Ji, Z.Y.[Zhong-Yi],
Wang, W.M.[Wen-Min],
Chen, B.Y.[Bao-Yang],
Han, X.[Xiao],
Text-to-Image Generation via Semi-Supervised Training,
VCIP20(265-268)
IEEE DOI
2102
image classification, learning (artificial intelligence),
text analysis, visual databases, text-to-image generation,
Pseudo Feature
BibRef
Devaranjan, J.[Jeevan],
Kar, A.[Amlan],
Fidler, S.[Sanja],
Meta-SIM2:
Unsupervised Learning of Scene Structure for Synthetic Data Generation,
ECCV20(XVII:715-733).
Springer DOI
2011
WWW Link.
BibRef
Song, Y.Z.[Yun-Zhu],
Tam, Z.R.[Zhi Rui],
Chen, H.J.[Hung-Jen],
Lu, H.H.[Huiao-Han],
Shuai, H.H.[Hong-Han],
Character-preserving Coherent Story Visualization,
ECCV20(XVII:18-33).
Springer DOI
2011
BibRef
Achituve, I.[Idan],
Maron, H.[Haggai],
Chechik, G.[Gal],
Self-Supervised Learning for Domain Adaptation on Point Clouds,
WACV21(123-133)
IEEE DOI
2106
Phase change materials, Training,
Task analysis
BibRef
Herzig, R.[Roei],
Bar, A.[Amir],
Xu, H.J.[Hui-Juan],
Chechik, G.[Gal],
Darrell, T.J.[Trevor J.],
Globerson, A.[Amir],
Learning Canonical Representations for Scene Graph to Image Generation,
ECCV20(XXVI:210-227).
Springer DOI
2011
BibRef
Zheng, H.T.[Hai-Tian],
Liao, H.[Haofu],
Chen, L.[Lele],
Xiong, W.[Wei],
Chen, T.L.[Tian-Lang],
Luo, J.B.[Jie-Bo],
Example-guided Image Synthesis Using Masked Spatial-channel Attention
and Self-supervision,
ECCV20(XIV:422-439).
Springer DOI
2011
BibRef
Mallya, A.[Arun],
Wang, T.C.[Ting-Chun],
Sapra, K.[Karan],
Liu, M.Y.[Ming-Yu],
World-Consistent Video-to-Video Synthesis,
ECCV20(VIII:359-378).
Springer DOI
2011
BibRef
Vo, D.M.[Duc Minh],
Sugimoto, A.[Akihiro],
Visual-relation Conscious Image Generation from Structured-text,
ECCV20(XXVIII:290-306).
Springer DOI
2011
BibRef
Burns, A.[Andrea],
Kim, D.H.[Dong-Hyun],
Wijaya, D.[Derry],
Saenko, K.[Kate],
Plummer, B.A.[Bryan A.],
Learning to Scale Multilingual Representations for Vision-Language
Tasks,
ECCV20(IV:197-213).
Springer DOI
2011
BibRef
Liang, J.D.[Jia-Dong],
Pei, W.J.[Wen-Jie],
Lu, F.[Feng],
CPGAN: Content-parsing Generative Adversarial Networks for
Text-to-image Synthesis,
ECCV20(IV:491-508).
Springer DOI
2011
BibRef
Nawhal, M.[Megha],
Zhai, M.Y.[Meng-Yao],
Lehrmann, A.[Andreas],
Sigal, L.[Leonid],
Mori, G.[Greg],
Generating Videos of Zero-shot Compositions of Actions and Objects,
ECCV20(XII: 382-401).
Springer DOI
2010
BibRef
Huang, H.P.[Hsin-Ping],
Tseng, H.Y.[Hung-Yu],
Lee, H.Y.[Hsin-Ying],
Huang, J.B.[Jia-Bin],
Semantic View Synthesis,
ECCV20(XII: 592-608).
Springer DOI
2010
BibRef
Zhu, Z.[Zhen],
Xu, Z.L.[Zhi-Liang],
You, A.S.[An-Sheng],
Bai, X.[Xiang],
Semantically Multi-Modal Image Synthesis,
CVPR20(5466-5475)
IEEE DOI
2008
Semantics, Task analysis, Convolutional codes, Image generation,
Decoding, Generators, Controllability
BibRef
Luo, A.,
Zhang, Z.,
Wu, J.,
Tenenbaum, J.B.,
End-to-End Optimization of Scene Layout,
CVPR20(3753-3762)
IEEE DOI
2008
Layout, Semantics, Decoding,
Rendering (computer graphics), Solid modeling, Training
BibRef
Gao, C.,
Liu, Q.,
Xu, Q.,
Wang, L.,
Liu, J.,
Zou, C.,
SketchyCOCO: Image Generation From Freehand Scene Sketches,
CVPR20(5173-5182)
IEEE DOI
2008
Image edge detection, Image generation, Training,
Data models, Semantics, Image segmentation
BibRef
Chen, Q.,
Wu, Q.,
Tang, R.,
Wang, Y.,
Wang, S.,
Tan, M.,
Intelligent Home 3D: Automatic 3D-House Design From Linguistic
Descriptions Only,
CVPR20(12622-12631)
IEEE DOI
2008
Layout, Buildings, Linguistics,
Task analysis, Solid modeling
BibRef
Liu, C.,
Mao, Z.,
Zhang, T.,
Xie, H.,
Wang, B.,
Zhang, Y.,
Graph Structured Network for Image-Text Matching,
CVPR20(10918-10927)
IEEE DOI
2008
Visualization, Dogs, Semantics, Sparse matrices,
Image edge detection, Learning systems, Feature extraction
BibRef
Sarafianos, N.,
Xu, X.,
Kakadiaris, I.,
Adversarial Representation Learning for Text-to-Image Matching,
ICCV19(5813-5823)
IEEE DOI
2004
image matching, image representation,
learning (artificial intelligence), Adversarial representation,
Distance measurement
BibRef
Tan, F.[Fuwen],
Feng, S.[Song],
Ordonez, V.[Vicente],
Text2Scene: Generating Compositional Scenes From Textual Descriptions,
CVPR19(6703-6712).
IEEE DOI
2002
BibRef
Yin, G.J.[Guo-Jun],
Liu, B.[Bin],
Sheng, L.[Lu],
Yu, N.H.[Neng-Hai],
Wang, X.G.[Xiao-Gang],
Shao, J.[Jing],
Semantics Disentangling for Text-To-Image Generation,
CVPR19(2322-2331).
IEEE DOI
2002
BibRef
Li, W.B.[Wen-Bo],
Zhang, P.C.[Peng-Chuan],
Zhang, L.[Lei],
Huang, Q.Y.[Qiu-Yuan],
He, X.D.[Xiao-Dong],
Lyu, S.W.[Si-Wei],
Gao, J.F.[Jian-Feng],
Object-Driven Text-To-Image Synthesis via Adversarial Training,
CVPR19(12166-12174).
IEEE DOI
2002
BibRef
Talavera, A.,
Tan, D.S.,
Azcarraga, A.,
Hua, K.,
Layout and Context Understanding for Image Synthesis with Scene
Graphs,
ICIP19(1905-1909)
IEEE DOI
1910
Generative Models, Text-to-Image Synthesis, Scene Graphs
BibRef
Joseph, K.J.,
Pal, A.[Arghya],
Rajanala, S.[Sailaja],
Balasubramanian, V.N.[Vineeth N.],
C4Synth: Cross-Caption Cycle-Consistent Text-to-Image Synthesis,
WACV19(358-366)
IEEE DOI
1904
image capture, image processing, virtual reality, visual databases,
image editing, virtual reality, plausible image,
Data models
BibRef
Zhang, Z.,
Xie, Y.,
Yang, L.,
Photographic Text-to-Image Synthesis with a Hierarchically-Nested
Adversarial Network,
CVPR18(6199-6208)
IEEE DOI
1812
Generators, Training, Image resolution,
Task analysis, Semantics, Measurement
BibRef
Qi, X.,
Chen, Q.,
Jia, J.Y.[Jia-Ya],
Koltun, V.,
Semi-Parametric Image Synthesis,
CVPR18(8808-8816)
IEEE DOI
1812
Image segmentation, Semantics, Layout, Training, Image generation,
Image color analysis, Pipelines
BibRef
Hong, S.H.[Seung-Hoon],
Yang, D.D.[Ding-Dong],
Choi, J.[Jongwook],
Lee, H.L.[Hong-Lak],
Inferring Semantic Layout for Hierarchical Text-to-Image Synthesis,
CVPR18(7986-7994)
IEEE DOI
1812
Layout, Generators, Semantics, Shape, Image generation,
Task analysis
BibRef
Sah, S.,
Peri, D.,
Shringi, A.,
Zhang, C.,
Dominguez, M.,
Savakis, A.,
Ptucha, R.,
Semantically Invariant Text-to-Image Generation,
ICIP18(3783-3787)
IEEE DOI
1809
Measurement, Image generation, Generators, Image quality, Detectors,
Visualization, Cost function
BibRef
Kong, C.[Chen],
Lin, D.[Dahua],
Bansal, M.[Mohit],
Urtasun, R.[Raquel],
Fidler, S.[Sanja],
What Are You Talking About? Text-to-Image Coreference,
CVPR14(3558-3565)
IEEE DOI
1409
3D object detection; Text and images; scene understanding
BibRef
Chapter on 3-D Object Description and Computation Techniques, Surfaces, Deformable, View Generation, Video Conferencing continues in
Diffusion for Description or Text to Image Generation.