11.14.3.4 Text to Image, Layout to Image, Image Based Rendering

Chapter Contents
Image Based Rendering. Stereo Image Based Rendering. Synthesis. Layout to Image. Image Synthesis. Text to Image.
See also Diffusion for Description or Text to Image Generation.
See also Vision Transformers for Image Generation and Image Synthesis.
See also Adversarial Networks for Image Synthesis, Image Generation.
See also Text to 3D Synthesis, Text to 3D Generation.
See also Text to Video Synthesis, Text to Motion.

Peng, Y.X.[Yu-Xin], Qi, J.W.[Jin-Wei],
Show and Tell in the Loop: Cross-Modal Circular Correlation Learning,
MultMed(21), No. 6, June 2019, pp. 1538-1550.
IEEE DOI 1906
Correlation, Bridges, Logic gates, Semantics, Task analysis, Cognition, Feeds, Circular correlation learning, cross-modal retrieval, text-to-image synthesis BibRef

Zhang, X.W.[Xin-Wei], Wang, J.[Jin], Lu, G.D.[Guo-Dong], Zhang, X.S.[Xu-Sheng],
Pattern understanding and synthesis based on layout tree descriptor,
VC(36), No. 6, June 2020, pp. 1141-1155.
WWW Link. 2005
BibRef

Baraheem, S.S.[Samah S.], Nguyen, T.V.[Tam V.],
Text-to-image via mask anchor points,
PRL(133), 2020, pp. 25-32.
Elsevier DOI 2005
Text-to-image, Mask dataset, Image synthesis, Anchor points BibRef

Chen, Q.[Qi], Wu, Q.[Qi], Chen, J.[Jian], Wu, Q.Y.[Qing-Yao], van den Hengel, A.J.[Anton J.], Tan, M.K.[Ming-Kui],
Scripted Video Generation With a Bottom-Up Generative Adversarial Network,
IP(29), 2020, pp. 7454-7467.
IEEE DOI 2007
Generative adversarial networks, video generation, semantic alignment, temporal coherence BibRef

Yang, M.[Min], Liu, J.H.[Jun-Hao], Shen, Y.[Ying], Zhao, Z.[Zhou], Chen, X.J.[Xiao-Jun], Wu, Q.Y.[Qing-Yao], Li, C.M.[Cheng-Ming],
An Ensemble of Generation- and Retrieval-Based Image Captioning With Dual Generator Generative Adversarial Network,
IP(29), 2020, pp. 9627-9640.
IEEE DOI 2011
Generators, Decoding, Generative adversarial networks, Training, Computational modeling, Task analysis, Image captioning, adversarial learning BibRef

Yuan, M., Peng, Y.,
CKD: Cross-Task Knowledge Distillation for Text-to-Image Synthesis,
MultMed(22), No. 8, August 2020, pp. 1955-1968.
IEEE DOI 2007
Semantics, Visualization, Task analysis, Image synthesis, Generative adversarial networks, Neural networks, image semantic understanding BibRef

Osahor, U., Kazemi, H., Dabouei, A., Nasrabadi, N.,
Quality Guided Sketch-to-Photo Image Synthesis,
Biometrics20(3575-3584)
IEEE DOI 2008
Pattern recognition BibRef

Zhao, B.[Bo], Yin, W.D.[Wei-Dong], Meng, L.L.[Li-Li], Sigal, L.[Leonid],
Layout2image: Image Generation from Layout,
IJCV(128), No. 10-11, November 2020, pp. 2418-2435.
Springer DOI 2009
BibRef
Earlier: A1, A3, A2, A4:
Image Generation From Layout,
CVPR19(8576-8585).
IEEE DOI 2002
BibRef

Sheng, L.[Lu], Pan, J.T.[Jun-Ting], Guo, J.M.[Jia-Ming], Shao, J.[Jing], Loy, C.C.[Chen Change],
High-Quality Video Generation from Static Structural Annotations,
IJCV(128), No. 10-11, November 2020, pp. 2552-2569.
Springer DOI 2009
BibRef

Li, K.[Ke], Peng, S.C.[Shi-Chong], Zhang, T.H.[Tian-Hao], Malik, J.[Jitendra],
Multimodal Image Synthesis with Conditional Implicit Maximum Likelihood Estimation,
IJCV(128), No. 10-11, November 2020, pp. 2607-2628.
Springer DOI 2009
BibRef
Earlier: A1, A3, A4, Only:
Diverse Image Synthesis From Semantic Layouts via Conditional IMLE,
ICCV19(4219-4228)
IEEE DOI 2004
image representation, image segmentation, learning (artificial intelligence), Probabilistic logic BibRef

Arora, H.[Himanshu], Mishra, S.[Saurabh], Peng, S.C.[Shi-Chong], Li, K.[Ke], Mahdavi-Amiri, A.[Ali],
Multimodal Shape Completion via Implicit Maximum Likelihood Estimation,
DLGC22(2957-2966)
IEEE DOI 2210
Point cloud compression, Maximum likelihood estimation, Shape, Conferences BibRef

Gao, L.L.[Lian-Li], Chen, D.Y.[Dai-Yuan], Zhao, Z.[Zhou], Shao, J.[Jie], Shen, H.T.[Heng Tao],
Lightweight dynamic conditional GAN with pyramid attention for text-to-image synthesis,
PR(110), 2021, pp. 107384.
Elsevier DOI 2011
Text-to-image synthesis, Conditional generative adversarial network (CGAN), Pyramid attentive fusion BibRef

Dong, Y.L.[Yan-Long], Zhang, Y.[Ying], Ma, L.[Lin], Wang, Z.[Zhi], Luo, J.B.[Jie-Bo],
Unsupervised text-to-image synthesis,
PR(110), 2021, pp. 107573.
Elsevier DOI 2011
Text-to-image synthesis, Generative adversarial network (GAN), Unsupervised training BibRef

Yuan, M., Peng, Y.,
Bridge-GAN: Interpretable Representation Learning for Text-to-Image Synthesis,
CirSysVideo(30), No. 11, November 2020, pp. 4258-4268.
IEEE DOI 2011
Visualization, Mutual information, Image synthesis, Task analysis, Training, Bridge circuits, Semantics, Text-to-image synthesis, Bridge-GAN BibRef

Li, R.F.[Rui-Fan], Wang, N.[Ning], Feng, F.X.[Fang-Xiang], Zhang, G.W.[Guang-Wei], Wang, X.J.[Xiao-Jie],
Exploring Global and Local Linguistic Representations for Text-to-Image Synthesis,
MultMed(22), No. 12, December 2020, pp. 3075-3087.
IEEE DOI 2011
Task analysis, Linguistics, Generators, Generative adversarial networks, Training, Correlation, cross-modal BibRef

Li, C.Y.[Chun-Ye], Kong, L.Y.[Li-Ya], Zhou, Z.P.[Zhi-Ping],
Improved-StoryGAN for sequential images visualization,
JVCIR(73), 2020, pp. 102956.
Elsevier DOI 2012
Story visualization, Weighted Activation Degree (WAD), Dilated Convolution, Gated Convolution BibRef

Tan, H., Liu, X., Liu, M., Yin, B., Li, X.,
KT-GAN: Knowledge-Transfer Generative Adversarial Network for Text-to-Image Synthesis,
IP(30), 2021, pp. 1275-1290.
IEEE DOI 2012
Task analysis, Semantics, Generators, Generative adversarial networks, Knowledge engineering, alternate attention-transfer mechanism BibRef

Wang, M.[Min], Lang, C.Y.[Cong-Yan], Feng, S.H.[Song-He], Wang, T.[Tao], Jin, Y.[Yi], Li, Y.D.[Yi-Dong],
Text to photo-realistic image synthesis via chained deep recurrent generative adversarial network,
JVCIR(74), 2021, pp. 102955.
Elsevier DOI 2101
Text-to-image synthesis, Logic relationships, Computational bottlenecks, Parameters sharing BibRef

Yang, Y., Wang, L., Xie, D., Deng, C., Tao, D.,
Multi-Sentence Auxiliary Adversarial Networks for Fine-Grained Text-to-Image Synthesis,
IP(30), 2021, pp. 2798-2809.
IEEE DOI 2102
Semantics, Task analysis, Visualization, Training, Generative adversarial networks, Correlation, Birds, negative sample learning BibRef

Elu, A.[Aitzol], Azkune, G.[Gorka], de Lacalle, O.L.[Oier Lopez], Arganda-Carreras, I.[Ignacio], Soroa, A.[Aitor], Agirre, E.[Eneko],
Inferring spatial relations from textual descriptions of images,
PR(113), 2021, pp. 107847.
Elsevier DOI 2103
Text-to-image synthesis, Natural language understanding, Spatial relations, Deep learning BibRef

Hu, T.[Tao], Long, C.J.[Cheng-Jiang], Xiao, C.X.[Chun-Xia],
A Novel Visual Representation on Text Using Diverse Conditional GAN for Visual Recognition,
IP(30), 2021, pp. 3499-3512.
IEEE DOI 2103
Use text from social media to train image recognition. Visualization, Feature extraction, Image recognition, Text recognition, Generators, visual recognition BibRef

Yang, C.Y.[Ce-Yuan], Shen, Y.J.[Yu-Jun], Zhou, B.L.[Bo-Lei],
Semantic Hierarchy Emerges in Deep Generative Representations for Scene Synthesis,
IJCV(129), No. 5, May 2021, pp. 1451-1466.
Springer DOI 2105
BibRef

Qi, Z.J.[Zhong-Jian], Fan, C.G.[Chao-Gang], Xu, L.F.[Liang-Feng], Li, X.K.[Xin-Ke], Zhan, S.[Shu],
MRP-GAN: Multi-resolution parallel generative adversarial networks for text-to-image synthesis,
PRL(147), 2021, pp. 1-7.
Elsevier DOI 2106
Text-to-image synthesis, Generative adversarial networks, Image generation BibRef

Li, Z.[Zeyu], Deng, C.[Cheng], Yang, E.K.[Er-Kun], Tao, D.C.[Da-Cheng],
Staged Sketch-to-Image Synthesis via Semi-Supervised Generative Adversarial Networks,
MultMed(23), 2021, pp. 2694-2705.
IEEE DOI 2109
Generative adversarial networks, Image generation, Training, Image edge detection, Task analysis, sketch BibRef

Zheng, J.B.[Jian-Bin], Liu, D.Q.[Da-Qing], Wang, C.Y.[Chao-Yue], Hu, M.H.[Ming-Hui], Yang, Z.P.[Zuo-Peng], Ding, C.X.[Chang-Xing], Tao, D.C.[Da-Cheng],
MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis,
IJCV(132), No. 1, January 2024, pp. 3537-3565.
Springer DOI 2409
BibRef

Rafique, M.U.[Muhammad Usman], Zhang, Y.[Yu], Brodie, B.[Benjamin], Jacobs, N.[Nathan],
Unifying Guided and Unguided Outdoor Image Synthesis,
NTIRE21(776-785)
IEEE DOI 2109
Training, Image synthesis, Impedance matching, Layout, Benchmark testing, Probabilistic logic BibRef

Wang, M.[Min], Lang, C.Y.[Cong-Yan], Liang, L.Q.[Li-Qian], Lyu, G.[Gengyu], Feng, S.H.[Song-He], Wang, T.[Tao],
Class-Balanced Text to Image Synthesis With Attentive Generative Adversarial Network,
MultMedMag(28), No. 3, July 2021, pp. 21-31.
IEEE DOI 2109
Generative adversarial networks, Training data, Semantics, Text processing, Image synthesis, generative adversarial network, rebalance BibRef

Li, A.[Ailin], Zhao, L.[Lei], Zuo, Z.W.[Zhi-Wen], Wang, Z.Z.[Zhi-Zhong], Chen, H.B.[Hai-Bo], Lu, D.M.[Dong-Ming], Xing, W.[Wei],
Diversified text-to-image generation via deep mutual information estimation,
CVIU(211), 2021, pp. 103259.
Elsevier DOI 2110
Generative Adversarial Nets (GANs), Text-to-image generation, Mutual Information BibRef

Wu, F.X.[Fu-Xiang], Cheng, J.[Jun], Wang, X.C.[Xin-Chao], Wang, L.[Lei], Tao, D.P.[Da-Peng],
Image Hallucination From Attribute Pairs,
Cyber(52), No. 1, January 2022, pp. 568-581.
IEEE DOI 2201
Semantics, Visualization, Generators, Syntactics, Training, Natural language processing, text-to-image synthesis BibRef

Hinz, T.[Tobias], Heinrich, S.[Stefan], Wermter, S.[Stefan],
Semantic Object Accuracy for Generative Text-to-Image Synthesis,
PAMI(44), No. 3, March 2022, pp. 1552-1565.
IEEE DOI 2202
Layout, Semantics, Measurement, Generators, Image resolution, Image quality, Text-to-image synthesis, generative models BibRef

Tan, H.C.[Hong-Chen], Liu, X.P.[Xiu-Ping], Yin, B.C.[Bao-Cai], Li, X.[Xin],
Cross-Modal Semantic Matching Generative Adversarial Networks for Text-to-Image Synthesis,
MultMed(24), 2022, pp. 832-845.
IEEE DOI 2202
Semantics, Task analysis, Generative adversarial networks, Generators, Feature extraction, Visualization, text CNNs BibRef

Feng, F.X.[Fang-Xiang], Niu, T.R.[Tian-Rui], Li, R.F.[Rui-Fan], Wang, X.J.[Xiao-Jie],
Modality Disentangled Discriminator for Text-to-Image Synthesis,
MultMed(24), 2022, pp. 2112-2124.
IEEE DOI 2204
Task analysis, Correlation, Image synthesis, Image reconstruction, Generative adversarial networks, Image representation, multi-modal disentangled representation learning BibRef

Tan, Y.X.[Yong Xuan], Lee, C.P.[Chin Poo], Neo, M.[Mai], Lim, K.M.[Kian Ming],
Text-to-image synthesis with self-supervised learning,
PRL(157), 2022, pp. 119-126.
Elsevier DOI 2205
Text-to-image-synthesis, Generative adversarial network, Self-supervised learning BibRef

Tan, Y.X.[Yong Xuan], Lee, C.P.[Chin Poo], Neo, M.[Mai], Lim, K.M.[Kian Ming], Lim, J.Y.[Jit Yan],
Text-to-image synthesis with self-supervised bi-stage generative adversarial network,
PRL(169), 2023, pp. 43-49.
Elsevier DOI 2305
Text-to-image-synthesis, Generative adversarial network, Self-supervised learning, GAN BibRef

Quan, F.[Fengnan], Lang, B.[Bo], Liu, Y.X.[Yan-Xi],
ARRPNGAN: Text-to-image GAN with attention regularization and region proposal networks,
SP:IC(106), 2022, pp. 116728.
Elsevier DOI 2206
Text-to-image synthesis, Generative adversarial network, Attention model, Region proposal network BibRef

Wang, H.X.[Hong-Xia], Ke, H.[Hao], Liu, C.[Chun],
An embedded method: Improve the relevance of text and face image with enhanced face attributes,
SP:IC(108), 2022, pp. 116815.
Elsevier DOI 2209
Generative adversarial networks, Text-to-image face image generation, Face synthesis, Visual attributes BibRef

Peng, J.[Jun], Zhou, Y.[Yiyi], Sun, X.S.[Xiao-Shuai], Cao, L.J.[Liu-Juan], Wu, Y.J.[Yong-Jian], Huang, F.Y.[Fei-Yue], Ji, R.R.[Rong-Rong],
Knowledge-Driven Generative Adversarial Network for Text-to-Image Synthesis,
MultMed(24), 2022, pp. 4356-4366.
IEEE DOI 2210
Visualization, Generative adversarial networks, Task analysis, Semantics, Measurement, Image synthesis, Feature extraction, pseudo turing test BibRef

Mazaheri, A.[Amir], Shah, M.[Mubarak],
Video Generation from Text Employing Latent Path Construction for Temporal Modeling,
ICPR22(5010-5016)
IEEE DOI 2212
Interpolation, Visualization, Natural languages, Stacking, Machine learning BibRef

Gu, J.J.[Jin-Jing], Wang, H.L.[Han-Li], Fan, R.C.[Rui-Chao],
Coherent Visual Storytelling via Parallel Top-Down Visual and Topic Attention,
CirSysVideo(33), No. 1, January 2023, pp. 257-268.
IEEE DOI 2301
Visualization, Decoding, Neural networks, Coherence, Task analysis, Image sequences, Feature extraction, Visual storytelling, phrase beam search BibRef

Li, T.P.[Teng-Peng], Wang, H.L.[Han-Li], He, B.[Bin], Chen, C.W.[Chang Wen],
Knowledge-Enriched Attention Network With Group-Wise Semantic for Visual Storytelling,
PAMI(45), No. 7, July 2023, pp. 8634-8645.
IEEE DOI 2306
Visualization, Semantics, Feature extraction, Decoding, Streaming media, GSM, Technological innovation, Encoder-decoder, visual storytelling BibRef

Gao, L.[Lin], Sun, J.M.[Jia-Mu], Mo, K.[Kaichun], Lai, Y.K.[Yu-Kun], Guibas, L.J.[Leonidas J.], Yang, J.[Jie],
SceneHGN: Hierarchical Graph Networks for 3D Indoor Scene Generation With Fine-Grained Geometry,
PAMI(45), No. 7, July 2023, pp. 8902-8919.
IEEE DOI 2306
Geometry, Layout, Shape, Solid modeling, Neural networks, Interpolation, 3D indoor scene synthesis, deep generative model, variational autoencoder BibRef

Hou, X.X.[Xian-Xu], Zhang, X.K.[Xiao-Kang], Li, Y.D.[Yu-Dong], Shen, L.L.[Lin-Lin],
TextFace: Text-to-Style Mapping Based Face Generation and Manipulation,
MultMed(25), 2023, pp. 3409-3419.
IEEE DOI 2309
BibRef

Liu, S.Y.[Si-Ying], Dragotti, P.L.[Pier Luigi],
Sensing Diversity and Sparsity Models for Event Generation and Video Reconstruction from Events,
PAMI(45), No. 10, October 2023, pp. 12444-12458.
IEEE DOI 2310
Event to video. BibRef

Tan, Z.R.[Zhao-Rui], Yang, X.[Xi], Ye, Z.H.[Zi-Han], Wang, Q.[Qiufeng], Yan, Y.[Yuyao], Nguyen, A.[Anh], Huang, K.[Kaizhu],
Semantic Similarity Distance: Towards better text-image consistency metric in text-to-image generation,
PR(144), 2023, pp. 109883.
Elsevier DOI 2310
Text-to-image, Image generation, Generative adversarial networks, Semantic consistency BibRef

Cheng, Q.R.[Qing-Rong], Wen, K.Y.[Ke-Yu], Gu, X.D.[Xiao-Dong],
Vision-Language Matching for Text-to-Image Synthesis via Generative Adversarial Networks,
MultMed(25), 2023, pp. 7062-7075.
IEEE DOI 2311
BibRef

Gao, L.L.[Lian-Li], Zhao, Q.[Qike], Zhu, J.C.[Jun-Chen], Su, S.[Sitong], Cheng, L.[Lechao], Zhao, L.[Lei],
From External to Internal: Structuring Image for Text-to-Image Attributes Manipulation,
MultMed(25), 2023, pp. 7248-7261.
IEEE DOI Code:
WWW Link. 2311
BibRef

Sun, M.Z.[Ming-Zhen], Wang, W.N.[Wei-Ning], Zhu, X.X.[Xin-Xin], Liu, J.[Jing],
Reparameterizing and dynamically quantizing image features for image generation,
PR(146), 2024, pp. 109962.
Elsevier DOI 2311
Vector quantization, Variational auto-encoder, Unconditional image generation, Text-to-image generation, Autoregressive generation BibRef

Liu, Z.Z.[Zheng-Zhe], Dai, P.[Peng], Li, R.[Ruihui], Qi, X.J.[Xiao-Juan], Fu, C.W.[Chi-Wing],
DreamStone: Image as a Stepping Stone for Text-Guided 3D Shape Generation,
PAMI(45), No. 12, December 2023, pp. 14385-14403.
IEEE DOI 2311
BibRef

Tang, Z.M.[Zheng-Mi], Miyazaki, T.[Tomo], Omachi, S.[Shinichiro],
A Scene-Text Synthesis Engine Achieved Through Learning From Decomposed Real-World Data,
IP(32), 2023, pp. 5837-5851.
IEEE DOI Code:
WWW Link. 2311
BibRef

Xu, Y.H.[Yong-Hao], Yu, W.[Weikang], Ghamisi, P.[Pedram], Kopp, M.[Michael], Hochreiter, S.[Sepp],
Txt2Img-MHN: Remote Sensing Image Generation From Text Using Modern Hopfield Networks,
IP(32), 2023, pp. 5737-5750.
IEEE DOI Code:
WWW Link. 2311
BibRef

Tan, H.C.[Hong-Chen], Yin, B.C.[Bao-Cai], Wei, K.[Kun], Liu, X.P.[Xiu-Ping], Li, X.[Xin],
ALR-GAN: Adaptive Layout Refinement for Text-to-Image Synthesis,
MultMed(25), 2023, pp. 8620-8631.
IEEE DOI 2312
BibRef

Liang, J.D.[Jia-Dong], Pei, W.J.[Wen-Jie], Lu, F.[Feng],
Layout-Bridging Text-to-Image Synthesis,
CirSysVideo(33), No. 12, December 2023, pp. 7438-7451.
IEEE DOI 2312
BibRef

Kuang, Y.[Yi], Ma, F.[Fei], Li, F.F.[Fang-Fang], Liu, Y.B.[Ying-Bing], Zhang, F.[Fan],
Semantic-Layout-Guided Image Synthesis for High-Quality Synthetic-Aperture Radar Detection Sample Generation,
RS(15), No. 24, 2023, pp. 5654.
DOI Link 2401
BibRef

Liu, A.A.[An-An], Sun, Z.F.[Ze-Fang], Xu, N.[Ning], Kang, R.B.[Rong-Bao], Cao, J.[Jinbo], Yang, F.[Fan], Qin, W.J.[Wei-Jun], Zhang, S.Y.[Shen-Yuan], Zhang, J.Q.[Jia-Qi], Li, X.[Xuanya],
Prior knowledge guided text to image generation,
PRL(177), 2024, pp. 89-95.
Elsevier DOI 2401
Text-to-image synthesis, Generative Adversarial Networks, Knowledge Guided GAN BibRef

Köksal, A.[Ali], Ak, K.E.[Kenan E.], Sun, Y.[Ying], Rajan, D.[Deepu], Lim, J.H.[Joo Hwee],
Controllable Video Generation With Text-Based Instructions,
MultMed(26), 2024, pp. 190-201.
IEEE DOI 2401
BibRef

Liu, J.W.[Jia-Wei], Wang, W.N.[Wei-Ning], Chen, S.[Sihan], Zhu, X.X.[Xin-Xin], Liu, J.[Jing],
Sounding Video Generator: A Unified Framework for Text-Guided Sounding Video Generation,
MultMed(26), 2024, pp. 141-153.
IEEE DOI 2401
BibRef

Ye, S.M.[Sen-Mao], Wang, H.[Huan], Tan, M.K.[Ming-Kui], Liu, F.[Fei],
Recurrent Affine Transformation for Text-to-Image Synthesis,
MultMed(26), 2024, pp. 462-473.
IEEE DOI 2402
Generators, Visualization, Fuses, Computational modeling, Generative adversarial networks, Training, Task analysis, spatial attention BibRef

Yuan, B.[Bowen], Sheng, Y.F.[Ye-Fei], Bao, B.K.[Bing-Kun], Chen, Y.P.P.[Yi-Ping Phoebe], Xu, C.S.[Chang-Sheng],
Semantic Distance Adversarial Learning for Text-to-Image Synthesis,
MultMed(26), 2024, pp. 1255-1266.
IEEE DOI 2402
Semantics, Generators, Training, Adversarial machine learning, Feature extraction, Generative adversarial networks, Birds, cycle consistency BibRef

Zhou, H.P.[Hua-Ping], Wu, T.[Tao], Ye, S.M.[Sen-Mao], Qin, X.[Xinru], Sun, K.[Kelei],
Enhancing fine-detail image synthesis from text descriptions by text aggregation and connection fusion module,
SP:IC(122), 2024, pp. 117099.
Elsevier DOI 2402
Generative adversarial network, Semantic consistency, Spatial attention, Text-to-image generation, Single-stage network BibRef

Hu, Y.[Yaosi], Luo, C.[Chong], Chen, Z.Z.[Zhen-Zhong],
A Benchmark for Controllable Text-Image-to-Video Generation,
MultMed(26), 2024, pp. 1706-1719.
IEEE DOI 2402
Task analysis, Measurement, Generators, Uncertainty, Visualization, Dynamics, Benchmark testing, Video generation, text-image-to-video, multimodal-conditioned generation BibRef

Han, G.[Guang], Lin, M.[Min], Li, Z.Y.[Zi-Yang], Zhao, H.T.[Hai-Tao], Kwong, S.[Sam],
Text-to-Image Person Re-Identification Based on Multimodal Graph Convolutional Network,
MultMed(26), 2024, pp. 6025-6036.
IEEE DOI 2404
Feature extraction, Task analysis, Visualization, Semantics, Graph neural networks, Data mining, graph convolutional network BibRef

Zhou, Y.[Yan], Qian, J.[Jiechang], Zhang, H.[Huaidong], Xu, X.[Xuemiao], Sun, H.[Huajie], Zeng, F.[Fanzhi], Zhou, Y.X.[Yue-Xia],
Adaptive multi-text union for stable text-to-image synthesis learning,
PR(152), 2024, pp. 110438.
Elsevier DOI 2405
Adaptive multi-text union learning, Text-to-image synthesis, Cross-modal generation BibRef

Yang, B.[Bing], Xiang, X.Q.[Xue-Qin], Kong, W.Z.[Wang-Zeng], Zhang, J.H.[Jian-Hai], Peng, Y.[Yong],
DMF-GAN: Deep Multimodal Fusion Generative Adversarial Networks for Text-to-Image Synthesis,
MultMed(26), 2024, pp. 6956-6967.
IEEE DOI 2405
Semantics, Generative adversarial networks, Generators, Training, Visualization, Image synthesis, Fuses, Deep multimodal fusion, text-to-image (T2I) synthesis BibRef

Tang, H.[Hao], Shao, L.[Ling], Sebe, N.[Nicu], Van Gool, L.J.[Luc J.],
Graph Transformer GANs With Graph Masked Modeling for Architectural Layout Generation,
PAMI(46), No. 6, June 2024, pp. 4298-4313.
IEEE DOI 2405
Layout, Transformers, Task analysis, Generative adversarial networks, Generators, Semantics, Buildings, architectural layout generation BibRef

Dong, P.[Pei], Wu, L.[Lei], Li, R.C.[Rui-Chen], Meng, X.X.[Xiang-Xu], Meng, L.[Lei],
Text to image synthesis with multi-granularity feature aware enhancement Generative Adversarial Networks,
CVIU(245), 2024, pp. 104042.
Elsevier DOI 2406
Generative adversarial network, Multi-granularity feature aware enhancement, Text-to-image, Diffusion BibRef

Tan, H.C.[Hong-Chen], Yin, B.C.[Bao-Cai], Xu, K.Q.[Kai-Qiang], Wang, H.S.[Hua-Sheng], Liu, X.P.[Xiu-Ping], Li, X.[Xin],
Attention-Bridged Modal Interaction for Text-to-Image Generation,
CirSysVideo(34), No. 7, July 2024, pp. 5400-5413.
IEEE DOI 2407
Semantics, Task analysis, Visualization, Computational modeling, Image synthesis, Generators, Layout, residual perception discriminator BibRef

Baraheem, S.S.[Samah S.], Nguyen, T.V.[Tam V.],
S5: Sketch-to-Image Synthesis via Scene and Size Sensing,
MultMedMag(31), No. 2, April 2024, pp. 7-16.
IEEE DOI 2408
Image synthesis, Instance segmentation, Feature extraction, Semantics, Image edge detection, Task analysis, Image analysis BibRef

Zhao, L.[Liang], Huang, P.[Pingda], Chen, T.T.[Teng-Tuo], Fu, C.J.[Chun-Jiang], Hu, Q.H.[Qing-Hao], Zhang, Y.Q.[Yang-Qianhui],
Multi-Sentence Complementarily Generation for Text-to-Image Synthesis,
MultMed(26), 2024, pp. 8323-8332.
IEEE DOI 2408
Semantics, Birds, Generative adversarial networks, Feature extraction, Task analysis, Image synthesis, Generators, text-to-image BibRef

Zhao, L.[Liang], Hu, Q.[Qinghao], Li, X.Y.[Xiao-Yuan], Zhao, J.Y.[Jing-Yuan],
Multimodal Fusion Generative Adversarial Network for Image Synthesis,
SPLetters(31), 2024, pp. 1865-1869.
IEEE DOI 2408
Image synthesis, Semantics, Image quality, Generative adversarial networks, Attention mechanisms, text-to-image synthesis BibRef

Wu, Z.Y.[Zhen-Yu], Wang, Z.W.[Zi-Wei], Liu, S.Y.[Sheng-Yu], Luo, H.[Hao], Lu, J.W.[Ji-Wen], Yan, H.B.[Hai-Bin],
FairScene: Learning unbiased object interactions for indoor scene synthesis,
PR(156), 2024, pp. 110737.
Elsevier DOI 2408
Indoor scene synthesis, Graph neural networks, Causal inference BibRef

Nazarieh, F.[Fatemeh], Feng, Z.H.[Zhen-Hua], Awais, M.[Muhammad], Wang, W.W.[Wen-Wu], Kittler, J.V.[Josef V.],
A Survey of Cross-Modal Visual Content Generation,
CirSysVideo(34), No. 8, August 2024, pp. 6814-6832.
IEEE DOI 2408
Visualization, Surveys, Data models, Task analysis, Measurement, Training, Generative adversarial networks, Generative models, visual content generation BibRef

Wang, Y.X.[Yi-Xuan], Zhou, W.G.[Wen-Gang], Bao, J.M.[Jian-Min], Wang, W.[Weilun], Li, L.[Li], Li, H.Q.[Hou-Qiang],
CLIP2GAN: Toward Bridging Text With the Latent Space of GANs,
CirSysVideo(34), No. 8, August 2024, pp. 6847-6859.
IEEE DOI 2408
Image synthesis, Training, Hair, Task analysis, Faces, Codes, Visualization, Text-guided image generation, image editing, generative adversarial nets BibRef

Zhai, Y.K.[Yi-Kui], Long, Z.H.[Zhi-Hao], Pan, W.F.[Wen-Feng], Chen, C.L.P.[C. L. Philip],
Mutual Information Compensation for High-Fidelity Image Generation With Limited Data,
SPLetters(31), 2024, pp. 2145-2149.
IEEE DOI 2409
Mutual information, Training, Generators, Image synthesis, Image resolution, Generative adversarial networks, wavelet transform BibRef

Li, Z.Y.[Zhuo-Yuan], Sun, Y.[Yi],
Parameter efficient finetuning of text-to-image models with trainable self-attention layer,
IVC(151), 2024, pp. 105296.
Elsevier DOI 2411
T2I models, Efficient finetuning, Attention control BibRef

Croitoru, F.A.[Florinel-Alin], Hondru, V.[Vlad], Ionescu, R.T.[Radu Tudor], Shah, M.[Mubarak],
Reverse Stable Diffusion: What prompt was used to generate this image?,
CVIU(249), 2024, pp. 104210.
Elsevier DOI Code:
WWW Link. 2412
Diffusion models, Reverse engineering, Image-to-prompt prediction, Text-to-image generation BibRef

Ibarrola, F.[Francisco], Lulham, R.[Rohan], Grace, K.[Kazjon],
Affect-Conditioned Image Generation,
AffCom(15), No. 4, October 2024, pp. 2169-2179.
IEEE DOI 2412
Training, Semantics, Predictive models, Creativity, Computational modeling, Task analysis, Neural networks, semantic models BibRef

Ahmed, Y.A.[Yeruru Asrar], Mittal, A.[Anurag],
Unsupervised Co-Generation of Foreground-Background Segmentation from Text-to-Image Synthesis,
CVIU(250), 2025, pp. 104223.
Elsevier DOI 2501
BibRef
Earlier: WACV24(5046-5057)
IEEE DOI 2404
Text-to-Image generations, GANs, Generative Adversarial Networks. Training, Image segmentation, Visualization, Computational modeling, Training data, Computer architecture, Vision + language and/or other modalities BibRef

Li, A.[Ailin], Zhao, L.[Lei], Zuo, Z.W.[Zhi-Wen], Xing, W.[Wei], Lu, D.M.[Dong-Ming],
Specific Diverse Text-to-Image Synthesis via Exemplar Guidance,
MultMedMag(31), No. 4, October 2024, pp. 37-48.
IEEE DOI 2501
Visualization, Task analysis, Semantics, Image synthesis, Generators, Training, Vectors BibRef

Zhang, Y.[Yue], Peng, C.T.[Cheng-Tao], Wang, Q.[Qiuli], Song, D.[Dan], Li, K.[Kaiyan], Zhou, S.K.[S. Kevin],
Unified Multi-Modal Image Synthesis for Missing Modality Imputation,
MedImg(44), No. 1, January 2025, pp. 4-18.
IEEE DOI 2501
Image synthesis, Imputation, Medical diagnostic imaging, Task analysis, Streams, Training, Feature extraction, data imputation BibRef

Ban, Y.H.[Yuan-Hao], Wang, R.C.[Ruo-Chen], Zhou, T.Y.[Tian-Yi], Cheng, M.[Minhao], Gong, B.Q.[Bo-Qing], Hsieh, C.J.[Cho-Jui],
Understanding the Impact of Negative Prompts: When and How Do They Take Effect?,
ECCV24(LXXXIX: 190-206).
Springer DOI 2412
specify what to exclude from the generated images BibRef

Stracke, N.[Nick], Baumann, S.A.[Stefan Andreas], Susskind, J.[Joshua], Bautista, M.A.[Miguel Angel], Ommer, B.[Björn],
CTRLorALTer: Conditional LorALTer for Efficient 0-shot Control and Altering of T2I Models,
ECCV24(LXXXVIII: 87-103).
Springer DOI 2412
BibRef

Hemmat, R.A.[Reyhane Askari], Hall, M.[Melissa], Sun, A.[Alicia], Ross, C.[Candace], Drozdzal, M.[Michal], Romero-Soriano, A.[Adriana],
Improving Geo-diversity of Generated Images with Contextualized Vendi Score Guidance,
ECCV24(LXXXVII: 213-229).
Springer DOI 2412
Code:
WWW Link. BibRef

Li, P.Z.[Peng-Zhi], Nie, Q.[Qiang], Chen, Y.[Ying], Jiang, X.[Xi], Wu, K.[Kai], Lin, Y.[Yuhuan], Liu, Y.[Yong], Peng, J.L.[Jin-Long], Wang, C.J.[Cheng-Jie], Zheng, F.[Feng],
Tuning-free Image Customization with Image and Text Guidance,
ECCV24(LXXVI: 233-250).
Springer DOI 2412
Project:
WWW Link. Guided customization. BibRef

Liu, J.Q.[Jia-Qi], Huang, T.[Tao], Xu, C.[Chang],
Training-free Composite Scene Generation for Layout-to-image Synthesis,
ECCV24(LXVIII: 37-53).
Springer DOI 2412
BibRef

Hong, Y.[Yan], Duan, Y.X.[Yu-Xuan], Zhang, B.[Bo], Chen, H.X.[Hao-Xing], Lan, J.[Jun], Zhu, H.[Huijia], Wang, W.Q.[Wei-Qiang], Zhang, J.[Jianfu],
Comfusion: Enhancing Personalized Generation by Instance-scene Compositing and Fusion,
ECCV24(XLIV: 1-18).
Springer DOI 2412
Personalized. BibRef

Xue, X.T.[Xiang-Tian], Wu, J.S.[Jia-Song], Kong, Y.Y.[You-Yong], Senhadji, L.[Lotfi], Shu, H.Z.[Hua-Zhong],
ST-LDM: A Universal Framework for Text-grounded Object Generation in Real Images,
ECCV24(XLVI: 145-162).
Springer DOI 2412
BibRef

Wu, Z.F.[Zhi-Fan], Huang, L.H.[Liang-Hua], Wang, W.[Wei], Wei, Y.H.[Yan-Heng], Liu, Y.[Yu],
MultiGen: Zero-Shot Image Generation from Multi-Modal Prompts,
ECCV24(VIII: 297-313).
Springer DOI 2412
BibRef

Seol, J.[Jaejung], Kim, S.[Seojun], Yoo, J.[Jaejun],
Posterllama: Bridging Design Ability of Language Model to Content-aware Layout Generation,
ECCV24(LXXXII: 451-468).
Springer DOI 2412
BibRef

Guerreiro, J.J.A.[Julian Jorge Andrade], Inoue, N.[Naoto], Masui, K.[Kento], Otani, M.[Mayu], Nakayama, H.[Hideki],
Layoutflow: Flow Matching for Layout Generation,
ECCV24(XXXVI: 56-72).
Springer DOI 2412
BibRef

Sun, Q.[Qi], Zhou, H.[Hang], Zhou, W.G.[Wen-Gang], Li, L.[Li], Li, H.Q.[Hou-Qiang],
Forest2seq: Revitalizing Order Prior for Sequential Indoor Scene Synthesis,
ECCV24(XXV: 251-268).
Springer DOI 2412
BibRef

Wei, Y.X.[Yu-Xiang], Ji, Z.L.[Zhi-Ling], Bai, J.F.[Jin-Feng], Zhang, H.Z.[Hong-Zhi], Zhang, L.[Lei], Zuo, W.M.[Wang-Meng],
Masterweaver: Taming Editability and Face Identity for Personalized Text-to-image Generation,
ECCV24(LI: 252-271).
Springer DOI 2412
BibRef

Kwon, M.[Mingi], Oh, S.W.[Seoung Wug], Zhou, Y.[Yang], Liu, D.[Difan], Lee, J.Y.[Joon-Young], Cai, H.R.[Hao-Ran], Liu, B.[Baqiao], Liu, F.[Feng], Uh, Y.J.[Young-Jung],
Harivo: Harnessing Text-to-image Models for Video Generation,
ECCV24(LIII: 19-36).
Springer DOI 2412
BibRef

Lee, S.H.[Seung Hyun], Li, Y.[Yinxiao], Ke, J.J.[Jun-Jie], Yoo, I.[Innfarn], Zhang, H.[Han], Yu, J.[Jiahui], Wang, Q.F.[Qi-Fei], Deng, F.[Fei], Entis, G.[Glenn], He, J.F.[Jun-Feng], Li, G.[Gang], Kim, S.[Sangpil], Essa, I.[Irfan], Yang, F.[Feng],
Parrot: Pareto-optimal Multi-reward Reinforcement Learning Framework for Text-to-image Generation,
ECCV24(XXXVIII: 462-478).
Springer DOI 2412
BibRef

Zheng, A.Y.J.[Amber Yi-Jia], Yeh, R.A.[Raymond A.],
IMMA: Immunizing Text-to-image Models Against Malicious Adaptation,
ECCV24(XXXIX: 458-475).
Springer DOI 2412
BibRef

Chen, J.S.[Jun-Song], Ge, C.J.[Chong-Jian], Xie, E.[Enze], Wu, Y.[Yue], Yao, L.W.[Le-Wei], Ren, X.Z.[Xiao-Zhe], Wang, Z.[Zhongdao], Luo, P.[Ping], Lu, H.C.[Hu-Chuan], Li, Z.G.[Zhen-Guo],
Pixart-sigma: Weak-to-strong Training of Diffusion Transformer for 4k Text-to-image Generation,
ECCV24(XXXII: 74-91).
Springer DOI 2412
BibRef

Chatterjee, A.[Agneet], Ben Melech-Stan, G.[Gabriela], Aflalo, E.[Estelle], Paul, S.[Sayak], Ghosh, D.[Dhruba], Gokhale, T.[Tejas], Schmidt, L.[Ludwig], Hajishirzi, H.[Hannaneh], Lal, V.[Vasudev], Baral, C.[Chitta], Yang, Y.Z.[Ye-Zhou],
Getting it Right: Improving Spatial Consistency in Text-to-image Models,
ECCV24(XXII: 204-222).
Springer DOI 2412
BibRef

Liu, R.T.[Run-Tao], Khakzar, A.[Ashkan], Gu, J.D.[Jin-Dong], Chen, Q.F.[Qi-Feng], Torr, P.H.S.[Philip H.S.], Pizzati, F.[Fabio],
Latent Guard: A Safety Framework for Text-to-image Generation,
ECCV24(XXVI: 93-109).
Springer DOI 2412
BibRef

Wei, F.[Fanyue], Zeng, W.[Wei], Li, Z.Y.[Zhen-Yang], Yin, D.W.[Da-Wei], Duan, L.X.[Li-Xin], Li, W.[Wen],
Powerful and Flexible: Personalized Text-to-image Generation via Reinforcement Learning,
ECCV24(XXVII: 394-410).
Springer DOI 2412
BibRef

Li, H.T.[Han-Ting], Niu, H.J.[Hong-Jing], Zhao, F.[Feng],
Stable Preference: Redefining Training Paradigm of Human Preference Model for Text-to-image Synthesis,
ECCV24(XXVIII: 250-266).
Springer DOI 2412
BibRef

Gal, R.[Rinon], Lichter, O.[Or], Richardson, E.[Elad], Patashnik, O.[Or], Bermano, A.H.[Amit H.], Chechik, G.[Gal], Cohen-Or, D.[Daniel],
LCM-Lookahead for Encoder-based Text-to-image Personalization,
ECCV24(XIV: 322-340).
Springer DOI 2412
BibRef

Dahary, O.[Omer], Patashnik, O.[Or], Aberman, K.[Kfir], Cohen-Or, D.[Daniel],
Be Yourself: Bounded Attention for Multi-subject Text-to-image Generation,
ECCV24(XIV: 432-448).
Springer DOI 2412
BibRef

Xiong, P.X.[Pei-Xi], Kozuch, M.[Michael], Jain, N.[Nilesh],
Textual-visual Logic Challenge: Understanding and Reasoning in Text-to-image Generation,
ECCV24(V: 318-334).
Springer DOI 2412
BibRef

Sun, Y.[Yanan], Liu, Y.C.[Yan-Chen], Tang, Y.[Yinhao], Pei, W.J.[Wen-Jie], Chen, K.[Kai],
Anycontrol: Create Your Artwork with Versatile Control on Text-to-image Generation,
ECCV24(XI: 92-109).
Springer DOI 2412
BibRef

Abdullah, A.[Ahmed], Ebert, N.[Nikolas], Wasenmüller, O.[Oliver],
Boosting Few-shot Detection with Large Language Models and Layout-to-image Synthesis,
ACCV24(VII: 202-219).
Springer DOI 2412
BibRef

Lu, C.Y.[Chen-Yi], Agarwal, S.[Shubham], Tanjim, M.M.[Md Mehrab], Mahadik, K.[Kanak], Rao, A.[Anup], Mitra, S.[Subrata], Saini, S.K.[Shiv Kumar], Bagchi, S.[Saurabh], Chaterji, S.[Somali],
Recon: Training-free Acceleration for Text-to-image Synthesis with Retrieval of Concept Prompt Trajectories,
ECCV24(LIX: 288-306).
Springer DOI 2412
BibRef

Zhao, S.H.[Shi-Hao], Hao, S.[Shaozhe], Zi, B.[Bojia], Xu, H.Z.[Huai-Zhe], Wong, K.Y.K.[Kwan-Yee K.],
Bridging Different Language Models and Generative Vision Models for Text-to-image Generation,
ECCV24(LXXXI: 70-86).
Springer DOI 2412
BibRef

Tan, Z.Y.[Zhi-Yu], Yang, M.[Mengping], Qin, L.[Luozheng], Yang, H.[Hao], Qian, Y.[Ye], Zhou, Q.[Qiang], Zhang, C.[Cheng], Li, H.[Hao],
An Empirical Study and Analysis of Text-to-image Generation Using Large Language Model-powered Textual Representation,
ECCV24(LXXX: 472-489).
Springer DOI 2412
BibRef

Chinchure, A.[Aditya], Shukla, P.[Pushkar], Bhatt, G.[Gaurav], Salij, K.[Kiri], Hosanagar, K.[Kartik], Sigal, L.[Leonid], Turk, M.[Matthew],
Tibet: Identifying and Evaluating Biases in Text-to-image Generative Models,
ECCV24(LXXIX: 429-446).
Springer DOI 2412
BibRef

Mittal, S.[Surbhi], Sudan, A.[Arnav], Vatsa, M.[Mayank], Singh, R.[Richa], Glaser, T.[Tamar], Hassner, T.[Tal],
Navigating Text-to-image Generative Bias Across Indic Languages,
ECCV24(LXXXVIII: 53-67).
Springer DOI 2412
BibRef

Chang, Y.S.[Ying-Shan], Zhang, Y.[Yasi], Fang, Z.Y.[Zhi-Yuan], Wu, Y.N.[Ying Nian], Bisk, Y.[Yonatan], Gao, F.[Feng],
Skews in the Phenomenon Space Hinder Generalization in Text-to-image Generation,
ECCV24(LXXXVII: 422-439).
Springer DOI 2412
BibRef

Yang, Y.Q.[Yu-Qing], Moremada, C.[Charuka], Deligiannis, N.[Nikos],
On the Detection of Images Generated from Text,
ICIP24(3792-3798)
IEEE DOI 2411
Resistance, Visualization, Computational modeling, Perturbation methods, Noise, Text to image, Detectors, robustness BibRef

Liu, Z.X.[Zhi-Xuan], Schaldenbrand, P.[Peter], Okogwu, B.C.[Beverley-Claire], Peng, W.X.[Wen-Xuan], Yun, Y.[Youngsik], Hundt, A.[Andrew], Kim, J.[Jihie], Oh, J.[Jean],
SCoFT: Self-Contrastive Fine-Tuning for Equitable Image Generation,
CVPR24(10822-10832)
IEEE DOI Code:
WWW Link. 2410
I.e. cultural biases. Measurement, Surveys, Image synthesis, Generative AI, Media, Copyright protection, Data models, Image Synthesis, Computer Vision for Social Good BibRef

Zhang, H.[Hang], Savov, A.[Anton], Dillenburger, B.[Benjamin],
MaskPLAN: Masked Generative Layout Planning from Partial Input,
CVPR24(8964-8973)
IEEE DOI 2410
Measurement, Training, Layout, Transformers, Planning, MAE, Generative, Graph, Layout BibRef

Zhang, Y.X.[Yu-Xuan], Song, Y.[Yiren], Liu, J.[JiaMing], Wang, R.[Rui], Yu, J.P.[Jin-Peng], Tang, H.[Hao], Li, H.X.[Hua-Xia], Tang, X.[Xu], Hu, Y.[Yao], Pan, H.[Han], Jing, Z.L.[Zhong-Liang],
SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation,
CVPR24(8069-8078)
IEEE DOI 2410
Code:
WWW Link. Training, Adaptation models, Image coding, Image synthesis, Ecosystems, Feature extraction BibRef

Lee, J.[Jumin], Lee, S.[Sebin], Jo, C.[Changho], Im, W.B.[Woo-Bin], Seon, J.[Juhyeong], Yoon, S.E.[Sung-Eui],
SemCity: Semantic Scene Generation with Triplane Diffusion,
CVPR24(28337-28347)
IEEE DOI Code:
WWW Link. 2410
Roads, Computational modeling, Semantics, Diffusion processes, Diffusion models, diffusion models, scene generation, semantic generation BibRef

Raistrick, A.[Alexander], Mei, L.J.[Ling-Jie], Kayan, K.[Karhan], Yan, D.[David], Zuo, Y.M.[Yi-Ming], Han, B.[Beining], Wen, H.Y.[Hong-Yu], Parakh, M.[Meenal], Alexandropoulos, S.[Stamatis], Lipson, L.[Lahav], Ma, Z.[Zeyu], Deng, J.[Jia],
Infinigen Indoors: Photorealistic Indoor Scenes using Procedural Generation,
CVPR24(21783-21794)
IEEE DOI 2410
Training, Procedural generation, Licenses, Real-time systems, Libraries, Generators, Procedural Generation, Indoor, Dataset, Robotics BibRef

Lin, Z.Q.[Zhi-Qiu], Pathak, D.[Deepak], Li, B.[Baiqi], Li, J.Y.[Jia-Yao], Xia, X.[Xide], Neubig, G.[Graham], Zhang, P.[Pengchuan], Ramanan, D.[Deva],
Evaluating Text-to-visual Generation with Image-to-text Generation,
ECCV24(IX: 366-384).
Springer DOI 2412
BibRef

Li, B.[Baiqi], Lin, Z.Q.[Zhi-Qiu], Pathak, D.[Deepak], Li, J.Y.[Jia-Yao], Fei, Y.X.[Yi-Xin], Wu, K.[Kewen], Xia, X.[Xide], Zhang, P.C.[Peng-Chuan], Neubig, G.[Graham], Ramanan, D.[Deva],
Evaluating and Improving Compositional Text-to-Visual Generation,
GenerativeFM24(5290-5301)
IEEE DOI 2410
Measurement, Visualization, Toxicology, Closed box, Footwear, Cognition BibRef

Ji, P.L.[Peng-Liang], Liu, J.[Junchen],
TlTScore: Towards Long-Tail Effects in Text-to-Visual Evaluation with Generative Foundation Models,
GenerativeFM24(5302-5313)
IEEE DOI 2410
Measurement, Accuracy, Semantics, Benchmark testing, Cognition, text-to-visual evaluation, benchmark evaluation, multimodal, generative foundation models BibRef

Zhao, S.Y.[Shi-Yu], Zhao, L.[Long], Kumar, B.G.V.[B.G. Vijay], Suh, Y.M.[Yu-Min], Metaxas, D.N.[Dimitris N.], Chandraker, M.[Manmohan], Schulter, S.[Samuel],
Generating Enhanced Negatives for Training Language-Based Object Detectors,
CVPR24(13592-13602)
IEEE DOI Code:
WWW Link. 2410
Training, Vocabulary, Accuracy, Training data, Text to image, Detectors, Benchmark testing, open-vocabulary object detection, negative example mining BibRef

Margaryan, H.[Hovhannes], Hayrapetyan, D.[Daniil], Cong, W.[Wenyan], Wang, Z.Y.[Zhang-Yang], Shi, H.[Humphrey],
DGBD: Depth Guided Branched Diffusion for Comprehensive Controllability in Multi-View Generation,
L3D24(747-756)
IEEE DOI 2410
Geometry, Shape, Pipelines, Text to image, Cameras, controllable multi-view generation, diffusion models, controllablity in multi-view generation BibRef

Fan, L.J.[Li-Jie], Chen, K.[Kaifeng], Krishnan, D.[Dilip], Katabi, D.[Dina], Isola, P.[Phillip], Tian, Y.[Yonglong],
Scaling Laws of Synthetic Images for Model Training ... for Now,
CVPR24(7382-7392)
IEEE DOI 2410
Training, Computational modeling, Machine vision, Text to image, Training data, Data models BibRef

Cai, H.[Han], Li, M.[Muyang], Zhang, Q.[Qinsheng], Liu, M.Y.[Ming-Yu], Han, S.[Song],
Condition-Aware Neural Network for Controlled Image Generation,
CVPR24(7194-7203)
IEEE DOI 2410
Image synthesis, Computational modeling, Neural networks, Text to image, Process control, Transformers, efficient deep learning BibRef

Qiao, P.[Pengchong], Shang, L.[Lei], Liu, C.[Chang], Sun, B.[Baigui], Ji, X.Y.[Xiang-Yang], Chen, J.[Jie],
FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-Shot Subject-Driven Generation,
CVPR24(7215-7224)
IEEE DOI 2410
Codes, Object oriented modeling, Semantics, Buildings, Text to image, subject-driven generation, derived class BibRef

Wu, R.Q.[Rui-Qi], Chen, L.[Liangyu], Yang, T.[Tong], Guo, C.[Chunle], Li, C.Y.[Chong-Yi], Zhang, X.Y.[Xiang-Yu],
LAMP: Learn A Motion Pattern for Few-Shot Video Generation,
CVPR24(7089-7098)
IEEE DOI Code:
WWW Link. 2410
Training, Computational modeling, Pipelines, Text to image, Diffusion models, Stability analysis, Quality assessment BibRef

Zhu, J.Y.[Jia-Yi], Guo, Q.[Qing], Juefei-Xu, F.[Felix], Huang, Y.H.[Yi-Hao], Liu, Y.[Yang], Pu, G.[Geguang],
Cosalpure: Learning Concept from Group Images for Robust Co-Saliency Detection,
CVPR24(3669-3678)
IEEE DOI Code:
WWW Link. 2410
Technological innovation, Purification, Perturbation methods, Noise, Semantics, Text to image, Object detection BibRef

Chan, K.C.K.[Kelvin C.K.], Zhao, Y.[Yang], Jia, X.[Xuhui], Yang, M.H.[Ming-Hsuan], Wang, H.[Huisheng],
Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance,
CVPR24(6733-6742)
IEEE DOI 2410
Training, Codes, Image synthesis, Text to image BibRef

Haji-Ali, M.[Moayed], Balakrishnan, G.[Guha], Ordonez, V.[Vicente],
ElasticDiffusion: Training-Free Arbitrary Size Image Generation Through Global-Local Content Separation,
CVPR24(6603-6612)
IEEE DOI Code:
WWW Link. 2410
Image synthesis, Text to image, Coherence, Diffusion models, Decoding, Trajectory, Text2Image, Diffusion Models, Image Generation, Stablediffusion BibRef

Haydarov, K.[Kilichbek], Muhamed, A.[Aashiq], Shen, X.Q.[Xiao-Qian], Lazarevic, J.[Jovana], Skorokhodov, I.[Ivan], Galappaththige, C.J.[Chamuditha Jayanga], Elhoseiny, M.[Mohamed],
Adversarial Text to Continuous Image Generation,
CVPR24(6316-6326)
IEEE DOI Code:
WWW Link. 2410
Training, Tensors, Image synthesis, Text to image, Modulation, Process control BibRef

Zhang, C.[Cheng], Wu, Q.Y.[Qian-Yi], Gambardella, C.C.[Camilo Cruz], Huang, X.S.[Xiao-Shui], Phung, D.[Dinh], Ouyang, W.L.[Wan-Li], Cai, J.F.[Jian-Fei],
Taming Stable Diffusion for Text to 360° Panorama Image Generation,
CVPR24(6347-6357)
IEEE DOI 2410
Image synthesis, Layout, Noise reduction, Computer architecture, Diffusion models, Distortion BibRef

Tang, J.[Junshu], Zeng, Y.H.[Yan-Hong], Fan, K.[Ke], Wang, X.H.[Xu-Heng], Dai, B.[Bo], Chen, K.[Kai], Ma, L.Z.[Li-Zhuang],
Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text,
CVPR24(6243-6253)
IEEE DOI 2410
Training, Geometry, Semantics, Text to image, Production, Texture generation, diffusion model BibRef

Hu, H.X.[He-Xiang], Chan, K.C.K.[Kelvin C.K.], Su, Y.C.[Yu-Chuan], Chen, W.[Wenhu], Li, Y.D.[Yan-Dong], Sohn, K.[Kihyuk], Zhao, Y.[Yang], Ben, X.[Xue], Gong, B.Q.[Bo-Qing], Cohen, W.[William], Chang, M.W.[Ming-Wei], Jia, X.[Xuhui],
Instruct-Imagen: Image Generation with Multi-modal Instruction,
CVPR24(4754-4763)
IEEE DOI 2410
Training, Adaptation models, Image synthesis, Image edge detection, Natural languages, Text to image, Diffusion Model, Generalization to Unseen Tasks BibRef

Kondapaneni, N.[Neehar], Marks, M.[Markus], Knott, M.[Manuel], Guimaraes, R.[Rogerio], Perona, P.[Pietro],
Text-Image Alignment for Diffusion-Based Perception,
CVPR24(13883-13893)
IEEE DOI 2410
Visualization, Codes, Semantic segmentation, Computational modeling, Text to image, Estimation, ADE20K BibRef

Qiao, R.[Runqi], Yang, L.[Lan], Pang, K.Y.[Kai-Yue], Zhang, H.G.[Hong-Gang],
Making Visual Sense of Oracle Bones for You and Me,
CVPR24(12656-12665)
IEEE DOI Code:
WWW Link. 2410
Training, Heart, Visualization, Semantics, Text to image, Manuals, Bones BibRef

Shrestha, R.[Robik], Zou, Y.[Yang], Chen, Q.Y.[Qiu-Yu], Li, Z.H.[Zhi-Heng], Xie, Y.S.[Yu-Sheng], Deng, S.Q.[Si-Qi],
FairRAG: Fair Human Generation via Fair Retrieval Augmentation,
CVPR24(11996-12005)
IEEE DOI 2410
Visualization, Image synthesis, Image databases, Computational modeling, training data, Text to image, bias, fairness, generative-ai BibRef

Jayasumana, S.[Sadeep], Ramalingam, S.[Srikumar], Veit, A.[Andreas], Glasner, D.[Daniel], Chakrabarti, A.[Ayan], Kumar, S.[Sanjiv],
Rethinking FID: Towards a Better Evaluation Metric for Image Generation,
CVPR24(9307-9315)
IEEE DOI Code:
WWW Link. 2410
Measurement, Machine learning algorithms, Image synthesis, Text to image, Machine learning, Probability distribution, CMMD BibRef

Wu, Y.[You], Liu, K.[Kean], Mi, X.Y.[Xiao-Yue], Tang, F.[Fan], Cao, J.[Juan], Li, J.T.[Jin-Tao],
U-VAP: User-specified Visual Appearance Personalization via Decoupled Self Augmentation,
CVPR24(9482-9491)
IEEE DOI Code:
WWW Link. 2410
Visualization, Semantics, Refining, Text to image, Aerospace electronics, Controllability BibRef

Ding, G.G.[Gang-Gui], Zhao, C.[Canyu], Wang, W.[Wen], Yang, Z.[Zhen], Liu, Z.[Zide], Chen, H.[Hao], Shen, C.H.[Chun-Hua],
FreeCustom: Tuning-Free Customized Image Generation for Multi-Concept Composition,
CVPR24(9089-9098)
IEEE DOI 2410
Training, Codes, Image synthesis, Text to image, Faces, image customization, diffusion model, generative model BibRef

Chen, R.D.[Rui-Dong], Wang, L.[Lanjun], Nie, W.Z.[Wei-Zhi], Zhang, Y.D.[Yong-Dong], Liu, A.A.[An-An],
AnyScene: Customized Image Synthesis with Composited Foreground,
CVPR24(8724-8733)
IEEE DOI 2410
Measurement, Visualization, Image synthesis, Semantics, Layout, Text to image, text to image generation, generative model BibRef

Yang, S.[Shuai], Zhou, Y.F.[Yi-Fan], Liu, Z.W.[Zi-Wei], Loy, C.C.[Chen Change],
Fresco: Spatial-Temporal Correspondence for Zero-Shot Video Translation,
CVPR24(8703-8712)
IEEE DOI 2410
Training, Visualization, Attention mechanisms, Superresolution, Text to image, Coherence, diffusion, video-to-video translation, intra-frame consistency BibRef

Po, R.[Ryan], Yang, G.[Guandao], Aberman, K.[Kfir], Wetzstein, G.[Gordon],
Orthogonal Adaptation for Modular Customization of Diffusion Models,
CVPR24(7964-7973)
IEEE DOI 2410
Adaptation models, Computational modeling, Scalability, Merging, Text to image, Interference BibRef

Bahmani, S.[Sherwin], Skorokhodov, I.[Ivan], Rong, V.[Victor], Wetzstein, G.[Gordon], Guibas, L.J.[Leonidas J.], Wonka, P.[Peter], Tulyakov, S.[Sergey], Park, J.J.[Jeong Joon], Tagliasacchi, A.[Andrea], Lindell, D.B.[David B.],
4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling,
CVPR24(7996-8006)
IEEE DOI 2410
Measurement, Training, Solid modeling, Dynamics, Text to image, Hybrid power systems BibRef

Horita, D.[Daichi], Inoue, N.[Naoto], Kikuchi, K.[Kotaro], Yamaguchi, K.[Kota], Aizawa, K.[Kiyoharu],
Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation,
CVPR24(67-76)
IEEE DOI 2410
Visualization, Layout, Training data, Computer architecture, Transformers, Generators, layout generation, content-aware layout generation BibRef

Zhang, S.[Sixian], Wang, B.[Bohan], Wu, J.Q.[Jun-Qiang], Li, Y.[Yan], Gao, T.T.[Ting-Ting], Zhang, D.[Di], Wang, Z.Y.[Zhong-Yuan],
Learning Multi-Dimensional Human Preference for Text-to-Image Generation,
CVPR24(8018-8027)
IEEE DOI 2410
Measurement, Image synthesis, Annotations, Computational modeling, Semantics, Text to image, Text-to-image generation, Evaluation BibRef

Zhang, Y.M.[Yi-Ming], Xing, Z.[Zhening], Zeng, Y.H.[Yan-Hong], Fang, Y.Q.[You-Qing], Chen, K.[Kai],
PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models,
CVPR24(7747-7756)
IEEE DOI 2410
Text to image, Benchmark testing, Animation, Controllability, Tuning BibRef

Huang, S.[Siteng], Gong, B.[Biao], Feng, Y.T.[Yu-Tong], Chen, X.[Xi], Fu, Y.Q.[Yu-Qian], Liu, Y.[Yu], Wang, D.L.[Dong-Lin],
Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation,
CVPR24(7797-7806)
IEEE DOI Code:
WWW Link. 2410
Animals, Semantics, Text to image, Feature extraction, Contamination, text-to-image generation, Action-Disentangled Identifier BibRef

Chen, Z.J.[Zi-Jie], Zhang, L.C.[Li-Chao], Weng, F.S.[Fang-Sheng], Pan, L.[Lili], Lan, Z.Z.[Zhen-Zhong],
Tailored Visions: Enhancing Text-to-Image Generation with Personalized Prompt Rewriting,
CVPR24(7727-7736)
IEEE DOI Code:
WWW Link. 2410
Visualization, Codes, Text to image BibRef

Qu, L.G.[Lei-Gang], Wang, W.J.[Wen-Jie], Li, Y.Q.[Yong-Qi], Zhang, H.W.[Han-Wang], Nie, L.Q.[Li-Qiang], Chua, T.S.[Tat-Seng],
Discriminative Probing and Tuning for Text-to-Image Generation,
CVPR24(7434-7444)
IEEE DOI Code:
WWW Link. 2410
Adaptation models, Large language models, Face recognition, Computational modeling, Layout, Text to image BibRef

Cheng, T.Y.[Ta-Ying], Gadelha, M.[Matheus], Groueix, T.[Thibault], Fisher, M.[Matthew], Mech, R.[Radomír], Markham, A.[Andrew], Trigoni, N.[Niki],
Learning Continuous 3D Words for Text-to-Image Generation,
CVPR24(6753-6762)
IEEE DOI Code:
WWW Link. 2410
Image recognition, Image synthesis, Text recognition, Shape, Text to image, Lighting BibRef

Ruiz, N.[Nataniel], Li, Y.Z.[Yuan-Zhen], Jampani, V.[Varun], Wei, W.[Wei], Hou, T.B.[Ting-Bo], Pritch, Y.[Yael], Wadhwa, N.[Neal], Rubinstein, M.[Michael], Aberman, K.[Kfir],
HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models,
CVPR24(6527-6536)
IEEE DOI 2410
Generative AI, Face recognition, Semantics, Memory management, Text to image, Graphics processing units, diffusion models, subject driven personalization BibRef

Zhang, Y.B.[Yan-Bing], Yang, M.[Mengping], Zhou, Q.[Qin], Wang, Z.[Zhe],
Attention Calibration for Disentangled Text-to-Image Personalization,
CVPR24(4764-4774)
IEEE DOI 2410
Visualization, Solid modeling, Image synthesis, Pipelines, Text to image, Text-to-image, Personalization, Attention Calibration BibRef

Burgert, R.D.[Ryan D.], Price, B.L.[Brian L.], Kuen, J.[Jason], Li, Y.J.[Yi-Jun], Ryoo, M.S.[Michael S.],
MAGICK: A Large-Scale Captioned Dataset from Matting Generated Images Using Chroma Keying,
CVPR24(22595-22604)
IEEE DOI Code:
WWW Link. 2410
Training, Hair, Image segmentation, Accuracy, Image synthesis, Text to image, alpha, matting, dataset, generation, text, image, compositing BibRef

Dao, T.T.[Trung Tuan], Vu, D.H.[Duc Hong], Pham, C.[Cuong], Tran, A.[Anh],
EFHQ: Multi-Purpose ExtremePose-Face-HQ Dataset,
CVPR24(22605-22615)
IEEE DOI 2410
Training, Deep learning, Face recognition, Pipelines, Text to image, Benchmark testing BibRef

Cazenavette, G.[George], Sud, A.[Avneesh], Leung, T.[Thomas], Usman, B.[Ben],
FakeInversion: Learning to Detect Images from Unseen Text-to-Image Models by Inverting Stable Diffusion,
CVPR24(10759-10769)
IEEE DOI 2410
Training, Visualization, Protocols, Text to image, Detectors, Benchmark testing, Feature extraction, diffusion, fake detection BibRef

Jayasumana, S.[Sadeep], Glasner, D.[Daniel], Ramalingam, S.[Srikumar], Veit, A.[Andreas], Chakrabarti, A.[Ayan], Kumar, S.[Sanjiv],
MarkovGen: Structured Prediction for Efficient Text-to-Image Generation,
CVPR24(9316-9325)
IEEE DOI 2410
Training, Image quality, Adaptation models, Image synthesis, Computational modeling, Text to image, Predictive models, Image generation BibRef

Ohanyan, M.[Marianna], Manukyan, H.[Hayk], Wang, Z.Y.[Zhang-Yang], Navasardyan, S.[Shant], Shi, H.[Humphrey],
Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis,
CVPR24(8764-8774)
IEEE DOI 2410
Shape, Layout, Text to image BibRef

Shi, J.[Jing], Xiong, W.[Wei], Lin, Z.[Zhe], Jung, H.J.[Hyun Joon],
InstantBooth: Personalized Text-to-Image Generation without Test-Time Finetuning,
CVPR24(8543-8552)
IEEE DOI Code:
WWW Link. 2410
Image quality, Adaptation models, Technological innovation, Image synthesis, Scalability, Text to image, image generation BibRef

Liang, Y.[Youwei], He, J.F.[Jun-Feng], Li, G.[Gang], Li, P.Z.[Pei-Zhao], Klimovskiy, A.[Arseniy], Carolan, N.[Nicholas], Sun, J.[Jiao], Pont-Tuset, J.[Jordi], Young, S.[Sarah], Yang, F.[Feng], Ke, J.J.[Jun-Jie], Dvijotham, K.D.[Krishnamurthy Dj], Collins, K.M.[Katherine M.], Luo, Y.W.[Yi-Wen], Li, Y.[Yang], Kohlhoff, K.J.[Kai J], Ramachandran, D.[Deepak], Navalpakkam, V.[Vidhya],
Rich Human Feedback for Text-to-Image Generation,
CVPR24(19401-19411)
IEEE DOI Code:
WWW Link. 2410
Image synthesis, Large language models, Text to image, Training data, Reinforcement learning, Predictive models, rich human feedback BibRef

Li, X.[Xiang], Shen, Q.L.[Qian-Li], Kawaguchi, K.[Kenji],
VA3: Virtually Assured Amplification Attack on Probabilistic Copyright Protection for Text-to-Image Generative Models,
CVPR24(12363-12373)
IEEE DOI Code:
WWW Link. 2410
Codes, Text to image, Closed box, Copyright protection, Probabilistic logic, copyright protection, text-to-image BibRef

d'Incà, M.[Moreno], Peruzzo, E.[Elia], Mancini, M.[Massimiliano], Xu, D.[Dejia], Goel, V.[Vidit], Xu, X.Q.[Xing-Qian], Wang, Z.Y.[Zhang-Yang], Shi, H.[Humphrey], Sebe, N.[Nicu],
OpenBias: Open-Set Bias Detection in Text-to-Image Generative Models,
CVPR24(12225-12235)
IEEE DOI 2410
Limiting, Prevention and mitigation, Large language models, Pipelines, Knowledge based systems, Text to image, Generative AI, Text-to-Image BibRef

Le Coz, A.[Adrien], Ouertatani, H.[Houssem], Herbin, S.[Stéphane], Adjed, F.[Faouzi],
Efficient Exploration of Image Classifier Failures with Bayesian Optimization and Text-to-Image Models,
GCV24(7569-7578)
IEEE DOI 2410
Training, Costs, Image synthesis, Computational modeling, Text to image, Benchmark testing, image classifier failures, bayesian optimization BibRef

Wang, Y.L.[Yi-Lin], Xu, H.Y.[Hai-Yang], Zhang, X.[Xiang], Chen, Z.[Zeyuan], Sha, Z.Z.[Zhi-Zhou], Wang, Z.[Zirui], Tu, Z.W.[Zhuo-Wen],
OmniControlNet: Dual-stage Integration for Conditional Image Generation,
GCV24(7436-7448)
IEEE DOI 2410
Image synthesis, Image edge detection, Redundancy, Pipelines, Text to image, Process control, Predictive models, Generative Models BibRef

Zhao, Y.Q.[Yi-Qun], Zhao, Z.[Zibo], Li, J.[Jing], Dong, S.[Sixun], Gao, S.H.[Sheng-Hua],
RoomDesigner: Encoding Anchor-latents for Style-consistent and Shape-compatible Indoor Scene Generation,
3DV24(1413-1423)
IEEE DOI 2408
Geometry, Shape, Vector quantization, Layout, Predictive models, Transformers, 3D Scene Generation BibRef

Ganz, R.[Roy], Elad, M.[Michael],
CLIPAG: Towards Generator-Free Text-to-Image Generation,
WACV24(3831-3841)
IEEE DOI 2404
Computational modeling, Semantics, Computer architecture, Generators, Task analysis, Image classification, Algorithms, Vision + language and/or other modalities BibRef

Park, S.[Seongbeom], Moon, S.H.[Su-Hong], Park, S.H.[Seung-Hyun], Kim, J.[Jinkyu],
Localization and Manipulation of Immoral Visual Cues for Safe Text-to-Image Generation,
WACV24(4663-4672)
IEEE DOI 2404
Location awareness, Ethics, Visualization, Analytical models, Image recognition, Computational modeling, Algorithms, Explainable, Vision + language and/or other modalities BibRef

Jeanneret, G.[Guillaume], Simon, L.[Loïc], Jurie, F.[Frédéric],
Text-to-Image Models for Counterfactual Explanations: A Black-Box Approach,
WACV24(4745-4755)
IEEE DOI 2404
Analytical models, Codes, Computational modeling, Closed box, Computer architecture, Algorithms, Explainable, fair, accountable, Vision + language and/or other modalities BibRef

Grimal, P.[Paul], Borgne, H.L.[Hervé Le], Ferret, O.[Olivier], Tourille, J.[Julien],
TIAM - A Metric for Evaluating Alignment in Text-to-Image Generation,
WACV24(2878-2887)
IEEE DOI 2404
Measurement, Image quality, Image color analysis, Rendering (computer graphics), Colored noise, Algorithms, Vision + language and/or other modalities BibRef

Qin, C.[Can], Yu, N.[Ning], Xing, C.[Chen], Zhang, S.[Shu], Chen, Z.Y.[Ze-Yuan], Ermon, S.[Stefano], Fu, Y.[Yun], Xiong, C.M.[Cai-Ming], Xu, R.[Ran],
GlueGen: Plug and Play Multi-Modal Encoders for X-to-Image Generation,
ICCV23(23028-23039)
IEEE DOI 2401
BibRef

Bahmani, S.[Sherwin], Park, J.J.[Jeong Joon], Paschalidou, D.[Despoina], Yan, X.G.[Xing-Guang], Wetzstein, G.[Gordon], Guibas, L.J.[Leonidas J.], Tagliasacchi, A.[Andrea],
CC3D: Layout-Conditioned Generation of Compositional 3D Scenes,
ICCV23(7137-7147)
IEEE DOI 2401
BibRef

Lee, T.[Taegyeong], Kang, J.[Jeonghun], Kim, H.[Hyeonyu], Kim, T.[Taehwan],
Generating Realistic Images from In-the-wild Sounds,
ICCV23(7126-7136)
IEEE DOI 2401
BibRef

Ye-Bin, M.[Moon], Kim, J.[Jisoo], Kim, H.Y.[Hong-Yeob], Son, K.[Kilho], Oh, T.H.[Tae-Hyun],
TextManiA: Enriching Visual Feature by Text-driven Manifold Augmentation,
ICCV23(2526-2537)
IEEE DOI 2401
BibRef

Ma, Y.W.[Yi-Wei], Wang, H.[Haowei], Zhang, X.Q.[Xiao-Qing], Jiang, G.[Guannan], Sun, X.S.[Xiao-Shuai], Zhuang, W.L.[Wei-Lin], Ji, J.Y.[Jia-Yi], Ji, R.R.[Rong-Rong],
X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance,
ICCV23(2737-2748)
IEEE DOI Code:
WWW Link. 2401
BibRef

Lin, J.W.[Jia-Wei], Guo, J.Q.[Jia-Qi], Sun, S.Z.[Shi-Zhao], Xu, W.J.[Wei-Jiang], Liu, T.[Ting], Lou, J.G.[Jian-Guang], Zhang, D.M.[Dong-Mei],
A Parse-Then-Place Approach for Generating Graphic Layouts from Textual Descriptions,
ICCV23(23565-23574)
IEEE DOI 2401
BibRef

Liu, N.[Nan], Du, Y.L.[Yi-Lun], Li, S.[Shuang], Tenenbaum, J.B.[Joshua B.], Torralba, A.[Antonio],
Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models,
ICCV23(2085-2095)
IEEE DOI 2401
BibRef

Wu, X.S.[Xiao-Shi], Sun, K.Q.[Ke-Qiang], Zhu, F.[Feng], Zhao, R.[Rui], Li, H.S.[Hong-Sheng],
Human Preference Score: Better Aligning Text-to-image Models with Human Preference,
ICCV23(2096-2105)
IEEE DOI Code:
WWW Link. 2401
BibRef

Le, T.V.[Thanh Van], Phung, H.[Hao], Nguyen, T.H.[Thuan Hoang], Dao, Q.[Quan], Tran, N.N.[Ngoc N.], Tran, A.[Anh],
Anti-DreamBooth: Protecting users from personalized text-to-image synthesis,
ICCV23(2116-2127)
IEEE DOI Code:
WWW Link. 2401
BibRef

Agarwal, A.[Aishwarya], Karanam, S.[Srikrishna], Joseph, K.J., Saxena, A.[Apoorv], Goswami, K.[Koustava], Srinivasan, B.V.[Balaji Vasan],
A-STAR: Test-time Attention Segregation and Retention for Text-to-image Synthesis,
ICCV23(2283-2293)
IEEE DOI 2401
BibRef

Cho, J.[Jaemin], Zala, A.[Abhay], Bansal, M.[Mohit],
DALL-EVAL: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models,
ICCV23(3020-3031)
IEEE DOI 2401
BibRef

Zhang, C.[Cheng], Chen, X.[Xuanbai], Chai, S.Q.[Si-Qi], Wu, C.H.[Chen Henry], Lagun, D.[Dmitry], Beeler, T.[Thabo], de la Torre, F.[Fernando],
ITI-Gen: Inclusive Text-to-Image Generation,
ICCV23(3946-3957)
IEEE DOI 2401
BibRef

Struppek, L.[Lukas], Hintersdorf, D.[Dominik], Kersting, K.[Kristian],
Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis,
ICCV23(4561-4573)
IEEE DOI Code:
WWW Link. 2401
BibRef

Basu, A.[Abhipsa], Babu, R.V.[R. Venkatesh], Pruthi, D.[Danish],
Inspecting the Geographical Representativeness of Images from Text-to-Image Models,
ICCV23(5113-5124)
IEEE DOI 2401
BibRef

Wang, S.Y.[Sheng-Yu], Efros, A.A.[Alexei A.], Zhu, J.Y.[Jun-Yan], Zhang, R.[Richard],
Evaluating Data Attribution for Text-to-Image Models,
ICCV23(7158-7169)
IEEE DOI 2401
BibRef

Park, M.H.[Min-Ho], Yun, J.[Jooyeol], Choi, S.[Seunghwan], Choo, J.[Jaegul],
Learning to Generate Semantic Layouts for Higher Text-Image Correspondence in Text-to-Image Synthesis,
ICCV23(7557-7566)
IEEE DOI Code:
WWW Link. 2401
BibRef

Höllein, L.[Lukas], Cao, A.[Ang], Owens, A.[Andrew], Johnson, J.[Justin], Nießner, M.[Matthias],
Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models,
ICCV23(7875-7886)
IEEE DOI 2401
BibRef

Wei, Y.X.[Yu-Xiang], Zhang, Y.[Yabo], Ji, Z.L.[Zhi-Long], Bai, J.F.[Jin-Feng], Zhang, L.[Lei], Zuo, W.M.[Wang-Meng],
ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation,
ICCV23(15897-15907)
IEEE DOI Code:
WWW Link. 2401
BibRef

Bakr, E.M.[Eslam Mohamed], Sun, P.Z.[Peng-Zhan], Shen, X.Q.[Xiao-Qian], Khan, F.F.[Faizan Farooq], Li, L.E.[Li Erran], Elhoseiny, M.[Mohamed],
HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models,
ICCV23(19984-19996)
IEEE DOI Code:
WWW Link. 2401
BibRef

Lee, J.[Jaewoong], Jang, S.[Sangwon], Jo, J.[Jaehyeong], Yoon, J.[Jaehong], Kim, Y.J.[Yun-Ji], Kim, J.H.[Jin-Hwa], Ha, J.W.[Jung-Woo], Hwang, S.J.[Sung Ju],
Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models,
ICCV23(23195-23205)
IEEE DOI 2401
BibRef

Hou, X.[Xia], Sun, M.[Meng], Song, W.F.[Wen-Feng],
Tell Your Story: Text-Driven Face Video Synthesis with High Diversity via Adversarial Learning,
ICIP23(515-519)
IEEE DOI Code:
WWW Link. 2312
BibRef

Zhang, Z.Q.[Zhi-Qiang], Xu, J.Y.[Jia-Yao], Morita, R.[Ryugo], Yu, W.X.[Wen-Xin], Zhou, J.J.[Jin-Jia],
Dynamic Unilateral Dual Learning for Text to Image Synthesis,
ICIP23(1130-1134)
IEEE DOI 2312
BibRef

Mao, J.F.[Jia-Feng], Wang, X.T.[Xue-Ting],
Training-Free Location-Aware Text-to-Image Synthesis,
ICIP23(995-999)
IEEE DOI 2312
BibRef

Chen, W.J.[Wen-Jie], Ni, Z.K.[Zhang-Kai], Wang, H.L.[Han-Li],
Structure-Aware Generative Adversarial Network for Text-to-Image Generation,
ICIP23(2075-2079)
IEEE DOI 2312
BibRef

Morita, R.[Ryugo], Zhang, Z.Q.[Zhi-Qiang], Zhou, J.J.[Jin-Jia],
BATINeT: Background-Aware Text to Image Synthesis and Manipulation Network,
ICIP23(765-769)
IEEE DOI 2312
BibRef

Yang, S.S.[Shu-Sheng], Ge, Y.X.[Yi-Xiao], Yi, K.[Kun], Li, D.[Dian], Shan, Y.[Ying], Qie, X.[Xiaohu], Wang, X.G.[Xing-Gang],
RILS: Masked Visual Reconstruction in Language Semantic Space,
CVPR23(23304-23314)
IEEE DOI 2309
BibRef

Wei, J.C.[Jia-Cheng], Wang, H.[Hao], Feng, J.S.[Jia-Shi], Lin, G.S.[Guo-Sheng], Yap, K.H.[Kim-Hui],
TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision,
CVPR23(16805-16815)
IEEE DOI 2309
BibRef

Zeng, Y.[Yu], Lin, Z.[Zhe], Zhang, J.M.[Jian-Ming], Liu, Q.[Qing], Collomosse, J.[John], Kuen, J.[Jason], Patel, V.M.[Vishal M.],
SceneComposer: Any-Level Semantic Image Synthesis,
CVPR23(22468-22478)
IEEE DOI 2309
BibRef

Lin, J.[Junfan], Chang, J.L.[Jian-Long], Liu, L.B.[Ling-Bo], Li, G.B.[Guan-Bin], Lin, L.[Liang], Tian, Q.[Qi], Chen, C.W.[Chang Wen],
Being Comes from Not-Being: Open-Vocabulary Text-to-Motion Generation with Wordless Training,
CVPR23(23222-23231)
IEEE DOI 2309
BibRef

Yang, Z.Y.[Zheng-Yuan], Wang, J.F.[Jian-Feng], Gan, Z.[Zhe], Li, L.J.[Lin-Jie], Lin, K.[Kevin], Wu, C.[Chenfei], Duan, N.[Nan], Liu, Z.C.[Zi-Cheng], Liu, C.[Ce], Zeng, M.[Michael], Wang, L.J.[Li-Juan],
ReCo: Region-Controlled Text-to-Image Generation,
CVPR23(14246-14255)
IEEE DOI 2309
BibRef

Otani, M.[Mayu], Togashi, R.[Riku], Sawai, Y.[Yu], Ishigami, R.[Ryosuke], Nakashima, Y.[Yuta], Rahtu, E.[Esa], Heikkilä, J.[Janne], Satoh, S.[Shin'ichi],
Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation,
CVPR23(14277-14286)
IEEE DOI 2309
BibRef

Liu, H.[Han], Wu, Y.H.[Yu-Hao], Zhai, S.[Shixuan], Yuan, B.[Bo], Zhang, N.[Ning],
RIATIG: Reliable and Imperceptible Adversarial Text-to-Image Generation with Natural Prompts,
CVPR23(20585-20594)
IEEE DOI 2309
BibRef

Kang, M.[Minguk], Zhu, J.Y.[Jun-Yan], Zhang, R.[Richard], Park, J.[Jaesik], Shechtman, E.[Eli], Paris, S.[Sylvain], Park, T.[Taesung],
Scaling up GANs for Text-to-Image Synthesis,
CVPR23(10124-10134)
IEEE DOI 2309
BibRef

Careil, M.[Marlène], Verbeek, J.[Jakob], Lathuilière, S.[Stéphane],
Few-shot Semantic Image Synthesis with Class Affinity Transfer,
CVPR23(23611-23620)
IEEE DOI 2309
BibRef

Kang, M.S.[Min-Soo], Lee, D.[Doyup], Kim, J.[Jiseob], Kim, S.[Saehoon], Han, B.H.[Bo-Hyung],
Variational Distribution Learning for Unsupervised Text-to-Image Generation,
CVPR23(23380-23389)
IEEE DOI 2309
BibRef

Sung-Bin, K.[Kim], Senocak, A.[Arda], Ha, H.W.[Hyun-Woo], Owens, A.[Andrew], Oh, T.H.[Tae-Hyun],
Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment,
CVPR23(6430-6440)
IEEE DOI 2309
BibRef

Cong, Y.[Yuren], Yi, J.H.[Jin-Hui], Rosenhahn, B.[Bodo], Yang, M.Y.[Michael Ying],
SSGVS: Semantic Scene Graph-to-Video Synthesis,
MULA23(2555-2565)
IEEE DOI 2309
BibRef

Zhang, S.X.[Si-Xian], Song, X.H.[Xin-Hang], Li, W.J.[Wei-Jie], Bai, Y.B.[Yu-Bing], Yu, X.Y.[Xin-Yao], Jiang, S.Q.[Shu-Qiang],
Layout-based Causal Inference for Object Navigation,
CVPR23(10792-10802)
IEEE DOI 2309
BibRef

Hsu, H.Y.[Hsiao-Yuan], He, X.T.[Xiang-Teng], Peng, Y.X.[Yu-Xin], Kong, H.[Hao], Zhang, Q.[Qing],
PosterLayout: A New Benchmark and Approach for Content-Aware Visual-Textual Presentation Layout,
CVPR23(6018-6026)
IEEE DOI 2309
BibRef

Xue, H.[Han], Huang, Z.W.[Zhi-Wu], Sun, Q.[Qianru], Song, L.[Li], Zhang, W.J.[Wen-Jun],
Freestyle Layout-to-Image Synthesis,
CVPR23(14256-14266)
IEEE DOI 2309
BibRef

Jiang, Z.Y.[Zhao-Yun], Guo, J.Q.[Jia-Qi], Sun, S.Z.[Shi-Zhao], Deng, H.Y.[Hua-Yu], Wu, Z.K.[Zhong-Kai], Mijovic, V.[Vuksan], Yang, Z.J.J.[Zi-Jiang James], Lou, J.G.[Jian-Guang], Zhang, D.M.[Dong-Mei],
LayoutFormer++: Conditional Graphic Layout Generation via Constraint Serialization and Decoding Space Restriction,
CVPR23(18403-18412)
IEEE DOI 2309
BibRef

Akula, A.R.[Arjun R.], Driscoll, B.[Brendan], Narayana, P.[Pradyumna], Changpinyo, S.[Soravit], Jia, Z.W.[Zhi-Wei], Damle, S.[Suyash], Pruthi, G.[Garima], Basu, S.[Sugato], Guibas, L.J.[Leonidas J.], Freeman, W.T.[William T.], Li, Y.Z.[Yuan-Zhen], Jampani, V.[Varun],
MetaCLUE: Towards Comprehensive Visual Metaphors Research,
CVPR23(23201-23211)
IEEE DOI 2309
BibRef

Hwang, I.[Inwoo], Kim, H.[Hyeonwoo], Kim, Y.M.[Young Min],
Text2Scene: Text-driven Indoor Scene Stylization with Part-Aware Details,
CVPR23(1890-1899)
IEEE DOI 2309
BibRef

Li, Y.H.[Yu-Heng], Liu, H.T.[Hao-Tian], Wu, Q.Y.[Qing-Yang], Mu, F.Z.[Fang-Zhou], Yang, J.W.[Jian-Wei], Gao, J.F.[Jian-Feng], Li, C.Y.[Chun-Yuan], Lee, Y.J.[Yong Jae],
GLIGEN: Open-Set Grounded Text-to-Image Generation,
CVPR23(22511-22521)
IEEE DOI 2309
BibRef

Lai, B.[Borun], Ma, L.H.[Li-Hong], Tian, J.[Jing],
Gated Cross Word-visual Attention-driven Generative Adversarial Networks for Text-to-image Synthesis,
ACCV22(VII:88-100).
Springer DOI 2307
BibRef

Wang, Z.W.[Zhi-Wei], Yang, J.[Jing], Cui, J.J.[Jia-Jun], Liu, J.W.[Jia-Wei], Wang, J.H.[Jia-Hao],
DAC-GAN: Dual Auxiliary Consistency Generative Adversarial Network for Text-to-Image Generation,
ACCV22(VII:3-19).
Springer DOI 2307
BibRef

Liang, M.L.[Ming-Liang], Liu, Z.R.[Zhuo-Ran], Larson, M.[Martha],
Textual Concept Expansion with Commonsense Knowledge to Improve Dual-Stream Image-Text Matching,
MMMod23(I: 421-433).
Springer DOI 2304
Text as input, output concepts BibRef

Loeschcke, S.[Sebastian], Belongie, S.[Serge], Benaim, S.[Sagie],
Text-driven Stylization of Video Objects,
CVEU22(594-609).
Springer DOI 2304
BibRef

Zhou, L.L.[Long-Long], Wu, X.J.[Xiao-Jun], Xu, T.Y.[Tian-Yang],
COMIM-GAN: Improved Text-to-Image Generation via Condition Optimization and Mutual Information Maximization,
MMMod23(I: 385-396).
Springer DOI 2304
BibRef

Lee, H.[Hanbit], Kim, Y.[Youna], Lee, S.G.[Sang-Goo],
Multi-scale Contrastive Learning for Complex Scene Generation,
WACV23(764-774)
IEEE DOI 2302
Semantics, Generative adversarial networks, Generators, Data models, Task analysis, image and video synthesis BibRef

Kim, J.Y.[Jih-Yun], Jeong, S.H.[Seong-Hun], Kong, K.[Kyeongbo], Kang, S.J.[Suk-Ju],
An Unified Framework for Language Guided Image Completion,
WACV23(2567-2577)
IEEE DOI 2302
Training, Visualization, Image synthesis, Computational modeling, Natural languages, Complexity theory, Vision + language and/or other modalities BibRef

Liao, W.T.[Wen-Tong], Hu, K.[Kai], Yang, M.Y.[Michael Ying], Rosenhahn, B.[Bodo],
Text to Image Generation with Semantic-Spatial Aware GAN,
CVPR22(18166-18175)
IEEE DOI 2210
Visualization, Image recognition, Image synthesis, Fuses, Computational modeling, Semantics, Vision+language BibRef

He, S.[Sen], Liao, W.T.[Wen-Tong], Yang, M.Y.[Michael Ying], Yang, Y.X.[Yong-Xin], Song, Y.Z.[Yi-Zhe], Rosenhahn, B.[Bodo], Xiang, T.[Tao],
Context-Aware Layout to Image Generation with Enhanced Object Appearance,
CVPR21(15044-15053)
IEEE DOI 2111
Visualization, Image synthesis, Computational modeling, Layout, Benchmark testing, Inspection, Generators BibRef

Wang, Z.K.[Ze-Kang], Liu, L.[Li], Zhang, H.X.[Hua-Xiang], Ma, Y.[Yue], Cui, H.L.[Huai-Lei], Chen, Y.[Yuan], Kong, H.R.[Hao-Ran],
Generative Adversarial Networks Based on Dynamic Word-Level Update for Text-to-Image Synthesis,
ICIVC22(641-647)
IEEE DOI 2301
Training, Image synthesis, Semantics, Benchmark testing, Generative adversarial networks, Visual effects, Generators, hierarchical image generation BibRef

Li, H.[Hui], Yuan, X.C.[Xu-Chang],
Image Generation Method of Bird Text Based on Improved StackGAN,
ICIVC22(805-811)
IEEE DOI 2301
Training, Image synthesis, Convolution, Computational modeling, Semantics, Birds, Cultural differences, Text to image, StackGAN, Residual structure BibRef

Liu, X.[Xian], Xu, Y.H.[Ying-Hao], Wu, Q.Y.[Qian-Yi], Zhou, H.[Hang], Wu, W.[Wayne], Zhou, B.[Bolei],
Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation,
ECCV22(XXXVII:106-125).
Springer DOI 2211
BibRef

Li, B.[Bowen],
Word-Level Fine-Grained Story Visualization,
ECCV22(XXXVI:347-362).
Springer DOI 2211
BibRef

Tan, R.[Reuben], Plummer, B.A.[Bryan A.], Saenko, K.[Kate], Lewis, J.P., Sud, A.[Avneesh], Leung, T.[Thomas],
NewsStories: Illustrating Articles with Visual Summaries,
ECCV22(XXXVI:644-661).
Springer DOI 2211
BibRef

Roy, P.[Prasun], Ghosh, S.[Subhankar], Bhattacharya, S.[Saumik], Pal, U.[Umapada], Blumenstein, M.[Michael],
TIPS: Text-Induced Pose Synthesis,
ECCV22(XXXVIII:161-178).
Springer DOI 2211
BibRef

Shi, Z.F.[Zi-Fan], Shen, Y.J.[Yu-Jun], Zhu, J.P.[Jia-Peng], Yeung, D.Y.[Dit-Yan], Chen, Q.F.[Qi-Feng],
3D-Aware Indoor Scene Synthesis with Depth Priors,
ECCV22(XVI:406-422).
Springer DOI 2211
BibRef

Lee, S.H.[Seung Hyun], Oh, G.[Gyeongrok], Byeon, W.[Wonmin], Kim, C.[Chanyoung], Ryoo, W.J.[Won Jeong], Yoon, S.H.[Sang Ho], Cho, H.[Hyunjun], Bae, J.Y.[Jih-Yun], Kim, J.[Jinkyu], Kim, S.[Sangpil],
Sound-Guided Semantic Video Generation,
ECCV22(XVII:34-50).
Springer DOI 2211
BibRef

Yan, K.[Kun], Ji, L.[Lei], Wu, C.F.[Chen-Fei], Bao, J.M.[Jian-Min], Zhou, M.[Ming], Duan, N.[Nan], Ma, S.[Shuai],
Trace Controlled Text to Image Generation,
ECCV22(XXXVI:59-75).
Springer DOI 2211
BibRef

Dinh, T.M.[Tan M.], Nguyen, R.[Rang], Hua, B.S.[Binh-Son],
TISE: Bag of Metrics for Text-to-Image Synthesis Evaluation,
ECCV22(XXXVI:594-609).
Springer DOI 2211
BibRef

Zhang, J.H.[Jia-Hui], Zhan, F.N.[Fang-Neng], Theobalt, C.[Christian], Lu, S.J.[Shi-Jian],
Regularized Vector Quantization for Tokenized Image Synthesis,
CVPR23(18467-18476)
IEEE DOI 2309
BibRef

Zhan, F.N.[Fang-Neng], Zhang, J.H.[Jia-Hui], Yu, Y.C.[Ying-Chen], Wu, R.L.[Rong-Liang], Lu, S.J.[Shi-Jian],
Modulated Contrast for Versatile Image Synthesis,
CVPR22(18259-18269)
IEEE DOI 2210
Photography, Visualization, Codes, Image synthesis, Force, Performance gain, Image and video synthesis and generation, Computational photography BibRef

Qiao, X.T.[Xiao-Tian], Hancke, G.P.[Gerhard P.], Lau, R.W.H.[Rynson W.H.],
Learning Object Context for Novel-view Scene Layout Generation,
CVPR22(16969-16978)
IEEE DOI 2210
Computational modeling, Layout, Semantics, Predictive models, Cameras, Probabilistic logic, Scene analysis and understanding, Image and video synthesis and generation BibRef

Ntavelis, E.[Evangelos], Shahbazi, M.[Mohamad], Kastanis, I.[Iason], Timofte, R.[Radu], Danelljan, M.[Martin], Van Gool, L.J.[Luc J.],
Arbitrary-Scale Image Synthesis,
CVPR22(11523-11532)
IEEE DOI 2210
Training, Image coding, Image synthesis, Pipelines, Generative adversarial networks, Encoding, Image and video synthesis and generation BibRef

Georgopoulos, M.[Markos], Oldfield, J.[James], Chrysos, G.G.[Grigorios G.], Panagakis, Y.[Yannis],
Cluster-guided Image Synthesis with Unconditional Models,
CVPR22(11533-11542)
IEEE DOI 2210
Hair, Maximum likelihood estimation, Image synthesis, Semantics, Process control, Generative adversarial networks, Generators, Explainable computer vision BibRef

Wei, Y.X.[Yu-Xiang], Ji, Z.L.[Zhi-Long], Wu, X.H.[Xiao-He], Bai, J.F.[Jin-Feng], Zhang, L.[Lei], Zuo, W.M.[Wang-Meng],
Inferring and Leveraging Parts from Object Shape for Improving Semantic Image Synthesis,
CVPR23(11248-11258)
IEEE DOI 2309
BibRef

Lv, Z.Y.[Zheng-Yao], Wei, Y.X.[Yu-Xiang], Zuo, W.M.[Wang-Meng], Wong, K.Y.K.[Kwan-Yee K.],
PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis,
CVPR24(9264-9274)
IEEE DOI 2410
Adaptation models, Visualization, Image synthesis, Source coding, Semantics, Layout, Text to image, semantic image synthesis BibRef

Lv, Z.Y.[Zheng-Yao], Li, X.M.[Xiao-Ming], Niu, Z.X.[Zhen-Xing], Cao, B.[Bing], Zuo, W.M.[Wang-Meng],
Semantic-shape Adaptive Feature Modulation for Semantic Image Synthesis,
CVPR22(11204-11213)
IEEE DOI 2210
Adaptation models, Codes, Shape, Image synthesis, Convolution, Semantics, Image and video synthesis and generation BibRef

Shi, Y.P.[Yu-Peng], Liu, X.[Xiao], Wei, Y.X.[Yu-Xiang], Wu, Z.Q.[Zhong-Qin], Zuo, W.M.[Wang-Meng],
Retrieval-based Spatially Adaptive Normalization for Semantic Image Synthesis,
CVPR22(11214-11223)
IEEE DOI 2210
Training, Visualization, Image synthesis, Shape, Navigation, Semantics, Wheels, Image and video synthesis and generation BibRef

Shim, S.H.[Sang-Heon], Hyun, S.[Sangeek], Bae, D.H.[Dae-Hyun], Heo, J.P.[Jae-Pil],
Local Attention Pyramid for Scene Image Generation,
CVPR22(7764-7772)
IEEE DOI 2210
Measurement, Deep learning, Visualization, Image segmentation, Image analysis, Image synthesis, Scene analysis and understanding BibRef

Wang, B.[Bo], Wu, T.[Tao], Zhu, M.[Minfeng], Du, P.[Peng],
Interactive Image Synthesis with Panoptic Layout Generation,
CVPR22(7773-7782)
IEEE DOI 2210
Visualization, Image synthesis, Shape, Perturbation methods, Layout, Semantics, Genomics, Image and video synthesis and generation BibRef

Yang, Z.P.[Zuo-Peng], Liu, D.Q.[Da-Qing], Wang, C.Y.[Chao-Yue], Yang, J.[Jie], Tao, D.C.[Da-Cheng],
Modeling Image Composition for Complex Scene Generation,
CVPR22(7754-7763)
IEEE DOI 2210
Training, Measurement, Visualization, Image coding, Layout, Genomics, Predictive models, Image and video synthesis and generation BibRef

Jeong, J.[Jaebong], Jo, J.[Janghun], Cho, S.[Sunghyun], Park, J.[Jaesik],
3D Scene Painting via Semantic Image Synthesis,
CVPR22(2252-2262)
IEEE DOI 2210
Training, Solid modeling, Image color analysis, Image synthesis, Machine vision, Semantics, Vision applications and systems, Vision + graphics BibRef

Aldausari, N.[Nuha], Sowmya, A.[Arcot], Marcus, N.[Nadine], Mohammadi, G.[Gelareh],
Cascaded Siamese Self-supervised Audio to Video GAN,
MULA22(4690-4699)
IEEE DOI 2210
Solid modeling, Correlation, Computational modeling, Pattern recognition BibRef

Tao, M.[Ming], Tang, H.[Hao], Wu, F.[Fei], Jing, X.Y.[Xiao-Yuan], Bao, B.K.[Bing-Kun], Xu, C.S.[Chang-Sheng],
DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis,
CVPR22(16494-16504)
IEEE DOI 2210
Visualization, Codes, Semantics, Generative adversarial networks, Generators, Vision+language, Image and video synthesis and generation BibRef

Zhou, Y.F.[Yu-Fan], Zhang, R.[Ruiyi], Chen, C.Y.[Chang-You], Li, C.Y.[Chun-Yuan], Tensmeyer, C.[Chris], Yu, T.[Tong], Gu, J.X.[Jiu-Xiang], Xu, J.H.[Jin-Hui], Sun, T.[Tong],
Towards Language-Free Training for Text-to-Image Generation,
CVPR22(17886-17896)
IEEE DOI 2210
Training, Image synthesis, Semantics, Training data, Tail, Data collection, Data models, Vision+language, Image and video synthesis and generation BibRef

Li, Z.H.[Zhi-Heng], Min, M.R.[Martin Renqiang], Li, K.[Kai], Xu, C.L.[Chen-Liang],
StyleT2I: Toward Compositional and High-Fidelity Text-to-Image Synthesis,
CVPR22(18176-18186)
IEEE DOI 2210
Measurement, Ethics, Image synthesis, Computational modeling, Semantics, Robustness, Image and video synthesis and generation, Vision+language BibRef

Sanghi, A.[Aditya], Chu, H.[Hang], Lambourne, J.G.[Joseph G.], Wang, Y.[Ye], Cheng, C.Y.[Chin-Yi], Fumero, M.[Marco], Malekshan, K.R.[Kamal Rahimi],
CLIP-Forge: Towards Zero-Shot Text-to-Shape Generation,
CVPR22(18582-18592)
IEEE DOI 2210
Training, Point cloud compression, Shape, Semantics, Natural languages, Vision + graphics, Vision+language BibRef

Jain, A.[Ajay], Mildenhall, B.[Ben], Barron, J.T.[Jonathan T.], Abbeel, P.[Pieter], Poole, B.[Ben],
Zero-Shot Text-Guided Object Generation with Dream Fields,
CVPR22(857-866)
IEEE DOI 2210
Geometry, Visualization, Solid modeling, Image color analysis, Shape, Deep learning architectures and techniques, Vision applications and systems BibRef

Bazazian, D.[Dena], Calway, A.[Andrew], Damen, D.[Dima],
Dual-Domain Image Synthesis using Segmentation-Guided GAN,
NTIRE22(506-515)
IEEE DOI 2210
Hair, Training, Image segmentation, Codes, Semantics, Nose, Mouth BibRef

Yang, Y.Y.[Yu-Yan], Ni, X.[Xin], Hao, Y.B.[Yan-Bin], Liu, C.Y.[Chen-Yu], Wang, W.S.[Wen-Shan], Liu, Y.F.[Yi-Feng], Xi, H.Y.[Hai-Yong],
MF-GAN: Multi-conditional Fusion Generative Adversarial Network for Text-to-Image Synthesis,
MMMod22(I:41-53).
Springer DOI 2203
Best paper section BibRef

Wang, Y.[Yi], Qi, L.[Lu], Chen, Y.C.[Ying-Cong], Zhang, X.Y.[Xiang-Yu], Jia, J.Y.[Jia-Ya],
Image Synthesis via Semantic Composition,
ICCV21(13729-13738)
IEEE DOI 2203
Correlation, Image synthesis, Convolution, Semantics, Layout, Benchmark testing, Image and video synthesis, Neural generative models BibRef

Dhamo, H.[Helisa], Manhardt, F.[Fabian], Navab, N.[Nassir], Tombari, F.[Federico],
Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes Using Scene Graphs,
ICCV21(16332-16341)
IEEE DOI 2203
Point cloud compression, Visualization, Solid modeling, Shape, Semantics, Scene analysis and understanding, BibRef

Li, Z.J.[Ze-Jian], Wu, J.Y.[Jing-Yu], Koh, I.[Immanuel], Tang, Y.C.[Yong-Chuan], Sun, L.Y.[Ling-Yun],
Image Synthesis from Layout with Locality-Aware Mask Adaption,
ICCV21(13799-13808)
IEEE DOI 2203
Adaptation models, Visualization, Image segmentation, Image synthesis, Computational modeling, Layout, Neural generative models BibRef

Qi, Y.G.[Yong-Gang], Su, G.Y.[Guo-Yao], Chowdhury, P.N.[Pinaki Nath], Li, M.K.[Ming-Kang], Song, Y.Z.[Yi-Zhe],
SketchLattice: Latticed Representation for Sketch Manipulation,
ICCV21(933-941)
IEEE DOI 2203
Image quality, Limiting, Computational modeling, Lattices, Task analysis, Vision + other modalities, Vision applications and systems BibRef

Yang, L.[Lan], Pang, K.Y.[Kai-Yue], Zhang, H.G.[Hong-Gang], Song, Y.Z.[Yi-Zhe],
SketchAA: Abstract Representation for Abstract Sketches,
ICCV21(10077-10086)
IEEE DOI 2203
Visualization, Image recognition, Codes, Computational modeling, Image retrieval, Rendering (computer graphics), Vision applications and systems BibRef

Canfes, Z.[Zehranaz], Atasoy, M.F.[M. Furkan], Dirik, A.[Alara], Yanardag, P.[Pinar],
Text and Image Guided 3D Avatar Generation and Manipulation,
WACV23(4410-4420)
IEEE DOI 2302
Solid modeling, Shape, Avatars, Source coding, Pipelines, Process control, Algorithms: 3D computer vision, Biometrics, face, body pose BibRef

Kocasari, U.[Umut], Dirik, A.[Alara], Tiftikci, M.[Mert], Yanardag, P.[Pinar],
StyleMC: Multi-Channel Based Fast Text-Guided Image Generation and Manipulation,
WACV22(3441-3450)
IEEE DOI 2202
Training, Hair, Codes, Image synthesis, Image color analysis, Semantics, Deep Learning BibRef

Xiang, X.Y.[Xiao-Yu], Liu, D.[Ding], Yang, X.[Xiao], Zhu, Y.H.[Yi-Heng], Shen, X.H.[Xiao-Hui], Allebach, J.P.[Jan P.],
Adversarial Open Domain Adaptation for Sketch-to-Photo Synthesis,
WACV22(944-954)
IEEE DOI 2202
Training, Image color analysis, Training data, Distortion, Generators, Optimization, Image and Video Synthesis BibRef

Ivgi, M.[Maor], Benny, Y.[Yaniv], Ben-David, A.[Avichai], Berant, J.[Jonathan], Wolf, L.B.[Lior B.],
Scene Graph To Image Generation with Contextualized Object Layout Refinement,
ICIP21(2428-2432)
IEEE DOI 2201
Image synthesis, Layout, Predictive models, Task analysis, Context modeling, Image Synthesis, Scene Graph, GAN BibRef

Jeon, E.[Eunyeong], Kim, K.[Kunhee], Kim, D.J.[Dai-Jin],
FA-GAN: Feature-Aware GAN for Text to Image Synthesis,
ICIP21(2443-2447)
IEEE DOI 2201
Image synthesis, Natural languages, Generative adversarial networks, Feature extraction, Generators, Feature-Aware GAN BibRef

Zhang, Z.Q.[Zhi-Qiang], Yu, W.X.[Wen-Xin], Jiang, N.[Ning], Zhou, J.J.[Jin-Jia],
Text To Image Synthesis With Erudite Generative Adversarial Networks,
ICIP21(2438-2442)
IEEE DOI 2201
Image synthesis, Generative adversarial networks, Data models, Task analysis, Text-to-Image Synthesis, Generative Adversarial Networks BibRef

Yuan, S.Z.[Shao-Zu], Dai, A.[Aijun], Yan, Z.L.[Zhi-Ling], Guo, Z.[Zehua], Liu, R.X.[Rui-Xue], Chen, M.[Meng],
SketchBird: Learning to Generate Bird Sketches from Text,
SHE21(2443-2452)
IEEE DOI 2112
Fuses, Shape, Error analysis, Image edge detection, Computational modeling BibRef

Berardi, G.[Gianluca], Salti, S.[Samuele], di Stefano, L.[Luigi],
SketchyDepth: from Scene Sketches to RGB-D Images,
SHE21(2414-2423)
IEEE DOI 2112
Training, Geometry, Image synthesis, Annotations, Conferences BibRef

Lu, X.P.[Xiao-Peng], Ng, L.[Lynnette], Fernandez, J.[Jared], Zhu, H.[Hao],
CIGLI: Conditional Image Generation from Language & Image,
CLVL21(3127-3131)
IEEE DOI 2112
Codes, Image synthesis, Computational modeling, Semantics, Cognition BibRef

Dorkenwald, M.[Michael], Milbich, T.[Timo], Blattmann, A.[Andreas], Rombach, R.[Robin], Derpanis, K.G.[Konstantinos G.], Ommer, B.[Björn],
Stochastic Image-to-Video Synthesis using cINNs,
CVPR21(3741-3752)
IEEE DOI 2111
Neural networks, Stochastic processes, Process control, Predictive models, Probabilistic logic BibRef

Zhang, H.[Han], Koh, J.Y.[Jing Yu], Baldridge, J.[Jason], Lee, H.L.[Hong-Lak], Yang, Y.F.[Yin-Fei],
Cross-Modal Contrastive Learning for Text-to-Image Generation,
CVPR21(833-842)
IEEE DOI 2111
Image quality, Image synthesis, Computational modeling, Impedance matching, Semantics, Natural languages, Generative adversarial networks BibRef

Koh, J.Y.[Jing Yu], Baldridge, J.[Jason], Lee, H.L.[Hong-Lak], Yang, Y.F.[Yin-Fei],
Text-to-Image Generation Grounded by Fine-Grained User Attention,
WACV21(237-246)
IEEE DOI 2106
Measurement, Image segmentation, Visualization, Grounding, Natural languages BibRef

Long, J.[Jia], Lu, H.T.[Hong-Tao],
Multi-level Gate Feature Aggregation with Spatially Adaptive Batch-instance Normalization for Semantic Image Synthesis,
MMMod21(I:378-390).
Springer DOI 2106
BibRef

Yan, J.W.[Jia-Wei], Lin, C.S.[Ci-Siang], Yang, F.E.[Fu-En], Li, Y.J.[Yu-Jhe], Wang, Y.C.F.[Yu-Chiang Frank],
Semantics-Guided Representation Learning with Applications to Visual Synthesis,
ICPR21(7181-7187)
IEEE DOI 2105
Visualization, Interpolation, Computational modeling, Semantics, Data visualization, Semantic interpolation BibRef

Tang, S.C.[Shi-Chang], Zhou, X.[Xu], He, X.M.[Xu-Ming], Ma, Y.[Yi],
Disentangled Representation Learning for Controllable Image Synthesis: An Information-Theoretic Perspective,
ICPR21(10042-10049)
IEEE DOI 2105
Training, Image synthesis, Image color analysis, Mutual information BibRef

Ji, Z.Y.[Zhong-Yi], Wang, W.M.[Wen-Min], Chen, B.Y.[Bao-Yang], Han, X.[Xiao],
Text-to-Image Generation via Semi-Supervised Training,
VCIP20(265-268)
IEEE DOI 2102
image classification, learning (artificial intelligence), text analysis, visual databases, text-to-image generation, Pseudo Feature BibRef

Devaranjan, J.[Jeevan], Kar, A.[Amlan], Fidler, S.[Sanja],
Meta-SIM2: Unsupervised Learning of Scene Structure for Synthetic Data Generation,
ECCV20(XVII:715-733).
Springer DOI 2011
WWW Link. BibRef

Song, Y.Z.[Yun-Zhu], Tam, Z.R.[Zhi Rui], Chen, H.J.[Hung-Jen], Lu, H.H.[Huiao-Han], Shuai, H.H.[Hong-Han],
Character-preserving Coherent Story Visualization,
ECCV20(XVII:18-33).
Springer DOI 2011
BibRef

Achituve, I.[Idan], Maron, H.[Haggai], Chechik, G.[Gal],
Self-Supervised Learning for Domain Adaptation on Point Clouds,
WACV21(123-133)
IEEE DOI 2106
Phase change materials, Training, Task analysis BibRef

Herzig, R.[Roei], Bar, A.[Amir], Xu, H.J.[Hui-Juan], Chechik, G.[Gal], Darrell, T.J.[Trevor J.], Globerson, A.[Amir],
Learning Canonical Representations for Scene Graph to Image Generation,
ECCV20(XXVI:210-227).
Springer DOI 2011
BibRef

Zheng, H.T.[Hai-Tian], Liao, H.[Haofu], Chen, L.[Lele], Xiong, W.[Wei], Chen, T.L.[Tian-Lang], Luo, J.B.[Jie-Bo],
Example-guided Image Synthesis Using Masked Spatial-channel Attention and Self-supervision,
ECCV20(XIV:422-439).
Springer DOI 2011
BibRef

Mallya, A.[Arun], Wang, T.C.[Ting-Chun], Sapra, K.[Karan], Liu, M.Y.[Ming-Yu],
World-Consistent Video-to-Video Synthesis,
ECCV20(VIII:359-378).
Springer DOI 2011
BibRef

Vo, D.M.[Duc Minh], Sugimoto, A.[Akihiro],
Visual-relation Conscious Image Generation from Structured-text,
ECCV20(XXVIII:290-306).
Springer DOI 2011
BibRef

Burns, A.[Andrea], Kim, D.H.[Dong-Hyun], Wijaya, D.[Derry], Saenko, K.[Kate], Plummer, B.A.[Bryan A.],
Learning to Scale Multilingual Representations for Vision-Language Tasks,
ECCV20(IV:197-213).
Springer DOI 2011
BibRef

Liang, J.D.[Jia-Dong], Pei, W.J.[Wen-Jie], Lu, F.[Feng],
CPGAN: Content-Parsing Generative Adversarial Networks for Text-to-Image Synthesis,
ECCV20(IV:491-508).
Springer DOI 2011
BibRef

Nawhal, M.[Megha], Zhai, M.Y.[Meng-Yao], Lehrmann, A.[Andreas], Sigal, L.[Leonid], Mori, G.[Greg],
Generating Videos of Zero-shot Compositions of Actions and Objects,
ECCV20(XII: 382-401).
Springer DOI 2010
BibRef

Huang, H.P.[Hsin-Ping], Tseng, H.Y.[Hung-Yu], Lee, H.Y.[Hsin-Ying], Huang, J.B.[Jia-Bin],
Semantic View Synthesis,
ECCV20(XII: 592-608).
Springer DOI 2010
BibRef

Zhu, Z.[Zhen], Xu, Z.L.[Zhi-Liang], You, A.S.[An-Sheng], Bai, X.[Xiang],
Semantically Multi-Modal Image Synthesis,
CVPR20(5466-5475)
IEEE DOI 2008
Semantics, Task analysis, Convolutional codes, Image generation, Decoding, Generators, Controllability BibRef

Luo, A., Zhang, Z., Wu, J., Tenenbaum, J.B.,
End-to-End Optimization of Scene Layout,
CVPR20(3753-3762)
IEEE DOI 2008
Layout, Semantics, Decoding, Rendering (computer graphics), Solid modeling, Training BibRef

Gao, C., Liu, Q., Xu, Q., Wang, L., Liu, J., Zou, C.,
SketchyCOCO: Image Generation From Freehand Scene Sketches,
CVPR20(5173-5182)
IEEE DOI 2008
Image edge detection, Image generation, Training, Data models, Semantics, Image segmentation BibRef

Chen, Q., Wu, Q., Tang, R., Wang, Y., Wang, S., Tan, M.,
Intelligent Home 3D: Automatic 3D-House Design From Linguistic Descriptions Only,
CVPR20(12622-12631)
IEEE DOI 2008
Layout, Buildings, Linguistics, Task analysis, Solid modeling BibRef

Liu, C., Mao, Z., Zhang, T., Xie, H., Wang, B., Zhang, Y.,
Graph Structured Network for Image-Text Matching,
CVPR20(10918-10927)
IEEE DOI 2008
Visualization, Dogs, Semantics, Sparse matrices, Image edge detection, Learning systems, Feature extraction BibRef

Sarafianos, N., Xu, X., Kakadiaris, I.,
Adversarial Representation Learning for Text-to-Image Matching,
ICCV19(5813-5823)
IEEE DOI 2004
image matching, image representation, learning (artificial intelligence), Adversarial representation, Distance measurement BibRef

Tan, F.[Fuwen], Feng, S.[Song], Ordonez, V.[Vicente],
Text2Scene: Generating Compositional Scenes From Textual Descriptions,
CVPR19(6703-6712).
IEEE DOI 2002
BibRef

Yin, G.J.[Guo-Jun], Liu, B.[Bin], Sheng, L.[Lu], Yu, N.H.[Neng-Hai], Wang, X.G.[Xiao-Gang], Shao, J.[Jing],
Semantics Disentangling for Text-To-Image Generation,
CVPR19(2322-2331).
IEEE DOI 2002
BibRef

Li, W.B.[Wen-Bo], Zhang, P.C.[Peng-Chuan], Zhang, L.[Lei], Huang, Q.Y.[Qiu-Yuan], He, X.D.[Xiao-Dong], Lyu, S.W.[Si-Wei], Gao, J.F.[Jian-Feng],
Object-Driven Text-To-Image Synthesis via Adversarial Training,
CVPR19(12166-12174).
IEEE DOI 2002
BibRef

Talavera, A., Tan, D.S., Azcarraga, A., Hua, K.,
Layout and Context Understanding for Image Synthesis with Scene Graphs,
ICIP19(1905-1909)
IEEE DOI 1910
Generative Models, Text-to-Image Synthesis, Scene Graphs BibRef

Joseph, K.J., Pal, A.[Arghya], Rajanala, S.[Sailaja], Balasubramanian, V.N.[Vineeth N.],
C4Synth: Cross-Caption Cycle-Consistent Text-to-Image Synthesis,
WACV19(358-366)
IEEE DOI 1904
image capture, image processing, virtual reality, visual databases, image editing, virtual reality, plausible image, Data models BibRef

Zhang, Z., Xie, Y., Yang, L.,
Photographic Text-to-Image Synthesis with a Hierarchically-Nested Adversarial Network,
CVPR18(6199-6208)
IEEE DOI 1812
Generators, Training, Image resolution, Task analysis, Semantics, Measurement BibRef

Qi, X., Chen, Q., Jia, J.Y.[Jia-Ya], Koltun, V.,
Semi-Parametric Image Synthesis,
CVPR18(8808-8816)
IEEE DOI 1812
Image segmentation, Semantics, Layout, Training, Image generation, Image color analysis, Pipelines BibRef

Hong, S.H.[Seung-Hoon], Yang, D.D.[Ding-Dong], Choi, J.[Jongwook], Lee, H.L.[Hong-Lak],
Inferring Semantic Layout for Hierarchical Text-to-Image Synthesis,
CVPR18(7986-7994)
IEEE DOI 1812
Layout, Generators, Semantics, Shape, Image generation, Task analysis BibRef

Sah, S., Peri, D., Shringi, A., Zhang, C., Dominguez, M., Savakis, A., Ptucha, R.,
Semantically Invariant Text-to-Image Generation,
ICIP18(3783-3787)
IEEE DOI 1809
Measurement, Image generation, Generators, Image quality, Detectors, Visualization, Cost function BibRef

Kong, C.[Chen], Lin, D.[Dahua], Bansal, M.[Mohit], Urtasun, R.[Raquel], Fidler, S.[Sanja],
What Are You Talking About? Text-to-Image Coreference,
CVPR14(3558-3565)
IEEE DOI 1409
3D object detection; Text and images; scene understanding BibRef

Chapter on 3-D Object Description and Computation Techniques, Surfaces, Deformable, View Generation, Video Conferencing continues in
Diffusion for Description or Text to Image Generation.


Last update: Jan 20, 2025 at 11:36:25