Peng, Y.X.[Yu-Xin],
Qi, J.W.[Jin-Wei],
Show and Tell in the Loop: Cross-Modal Circular Correlation Learning,
MultMed(21), No. 6, June 2019, pp. 1538-1550.
IEEE DOI
1906
Correlation, Bridges, Logic gates, Semantics, Task analysis, Cognition,
Feeds, Circular correlation learning, cross-modal retrieval,
text-to-image synthesis
BibRef
Baraheem, S.S.[Samah S.],
Nguyen, T.V.[Tam V.],
Text-to-image via mask anchor points,
PRL(133), 2020, pp. 25-32.
Elsevier DOI
2005
Text-to-image, Mask dataset, Image synthesis, Anchor points
BibRef
Osahor, U.,
Kazemi, H.,
Dabouei, A.,
Nasrabadi, N.,
Quality Guided Sketch-to-Photo Image Synthesis,
Biometrics20(3575-3584)
IEEE DOI
2008
Pattern recognition
BibRef
Yuan, M.,
Peng, Y.,
Bridge-GAN: Interpretable Representation Learning for Text-to-Image
Synthesis,
CirSysVideo(30), No. 11, November 2020, pp. 4258-4268.
IEEE DOI
2011
Visualization, Mutual information, Image synthesis, Task analysis,
Training, Bridge circuits, Semantics, Text-to-image synthesis,
Bridge-GAN
BibRef
Hu, T.[Tao],
Long, C.J.[Cheng-Jiang],
Xiao, C.X.[Chun-Xia],
A Novel Visual Representation on Text Using Diverse Conditional GAN
for Visual Recognition,
IP(30), 2021, pp. 3499-3512.
IEEE DOI
2103
Use text from social media to train image recognition.
Visualization, Feature extraction, Image recognition,
Text recognition, Generators,
visual recognition
BibRef
Yang, C.Y.[Ce-Yuan],
Shen, Y.J.[Yu-Jun],
Zhou, B.L.[Bo-Lei],
Semantic Hierarchy Emerges in Deep Generative Representations for Scene
Synthesis,
IJCV(129), No. 5, May 2021, pp. 1451-1466.
Springer DOI
2105
BibRef
Wu, F.X.[Fu-Xiang],
Cheng, J.[Jun],
Wang, X.C.[Xin-Chao],
Wang, L.[Lei],
Tao, D.P.[Da-Peng],
Image Hallucination From Attribute Pairs,
Cyber(52), No. 1, January 2022, pp. 568-581.
IEEE DOI
2201
Semantics, Visualization, Generators, Syntactics,
Training, Natural language processing, text-to-image synthesis
BibRef
Hinz, T.[Tobias],
Heinrich, S.[Stefan],
Wermter, S.[Stefan],
Semantic Object Accuracy for Generative Text-to-Image Synthesis,
PAMI(44), No. 3, March 2022, pp. 1552-1565.
IEEE DOI
2202
Layout, Semantics, Measurement, Generators, Image resolution,
Image quality, Text-to-image synthesis,
generative models
BibRef
Gu, J.J.[Jin-Jing],
Wang, H.L.[Han-Li],
Fan, R.C.[Rui-Chao],
Coherent Visual Storytelling via Parallel Top-Down Visual and Topic
Attention,
CirSysVideo(33), No. 1, January 2023, pp. 257-268.
IEEE DOI
2301
Visualization, Decoding, Neural networks, Coherence, Task analysis,
Image sequences, Feature extraction, Visual storytelling,
phrase beam search
BibRef
Li, T.P.[Teng-Peng],
Wang, H.L.[Han-Li],
He, B.[Bin],
Chen, C.W.[Chang Wen],
Knowledge-Enriched Attention Network With Group-Wise Semantic for
Visual Storytelling,
PAMI(45), No. 7, July 2023, pp. 8634-8645.
IEEE DOI
2306
Visualization, Semantics, Feature extraction, Decoding,
Streaming media, GSM, Technological innovation, Encoder-decoder,
visual storytelling
BibRef
Hou, X.X.[Xian-Xu],
Zhang, X.K.[Xiao-Kang],
Li, Y.D.[Yu-Dong],
Shen, L.L.[Lin-Lin],
TextFace: Text-to-Style Mapping Based Face Generation and
Manipulation,
MultMed(25), 2023, pp. 3409-3419.
IEEE DOI
2309
BibRef
Gao, L.L.[Lian-Li],
Zhao, Q.[Qike],
Zhu, J.C.[Jun-Chen],
Su, S.[Sitong],
Cheng, L.C.[Le-Chao],
Zhao, L.[Lei],
From External to Internal: Structuring Image for Text-to-Image
Attributes Manipulation,
MultMed(25), 2023, pp. 7248-7261.
IEEE DOI Code:
WWW Link.
2311
BibRef
Sun, M.Z.[Ming-Zhen],
Wang, W.N.[Wei-Ning],
Zhu, X.X.[Xin-Xin],
Liu, J.[Jing],
Reparameterizing and dynamically quantizing image features for image
generation,
PR(146), 2024, pp. 109962.
Elsevier DOI
2311
Vector quantization, Variational auto-encoder,
Unconditional image generation, Text-to-image generation,
Autoregressive generation
BibRef
Tang, Z.M.[Zheng-Mi],
Miyazaki, T.[Tomo],
Omachi, S.[Shinichiro],
A Scene-Text Synthesis Engine Achieved Through Learning From
Decomposed Real-World Data,
IP(32), 2023, pp. 5837-5851.
IEEE DOI Code:
WWW Link.
2311
BibRef
Xu, Y.H.[Yong-Hao],
Yu, W.[Weikang],
Ghamisi, P.[Pedram],
Kopp, M.[Michael],
Hochreiter, S.[Sepp],
Txt2Img-MHN: Remote Sensing Image Generation From Text Using Modern
Hopfield Networks,
IP(32), 2023, pp. 5737-5750.
IEEE DOI Code:
WWW Link.
2311
BibRef
Zhou, Y.[Yan],
Qian, J.C.[Jie-Chang],
Zhang, H.D.[Huai-Dong],
Xu, X.M.[Xue-Miao],
Sun, H.J.[Hua-Jie],
Zeng, F.Z.[Fan-Zhi],
Zhou, Y.X.[Yue-Xia],
Adaptive multi-text union for stable text-to-image synthesis learning,
PR(152), 2024, pp. 110438.
Elsevier DOI
2405
Adaptive multi-text union learning, Text-to-image synthesis,
Cross-modal generation
BibRef
Tan, H.C.[Hong-Chen],
Yin, B.C.[Bao-Cai],
Xu, K.Q.[Kai-Qiang],
Wang, H.S.[Hua-Sheng],
Liu, X.P.[Xiu-Ping],
Li, X.[Xin],
Attention-Bridged Modal Interaction for Text-to-Image Generation,
CirSysVideo(34), No. 7, July 2024, pp. 5400-5413.
IEEE DOI
2407
Semantics, Task analysis, Visualization, Computational modeling,
Image synthesis, Generators, Layout,
residual perception discriminator
BibRef
Baraheem, S.S.[Samah S.],
Nguyen, T.V.[Tam V.],
S5: Sketch-to-Image Synthesis via Scene and Size Sensing,
MultMedMag(31), No. 2, April 2024, pp. 7-16.
IEEE DOI
2408
Image synthesis, Instance segmentation, Feature extraction,
Semantics, Image edge detection, Task analysis, Image analysis
BibRef
Wu, Z.Y.[Zhen-Yu],
Wang, Z.W.[Zi-Wei],
Liu, S.Y.[Sheng-Yu],
Luo, H.[Hao],
Lu, J.W.[Ji-Wen],
Yan, H.B.[Hai-Bin],
FairScene: Learning unbiased object interactions for indoor scene
synthesis,
PR(156), 2024, pp. 110737.
Elsevier DOI
2408
Indoor scene synthesis, Graph neural networks, Causal inference
BibRef
Li, Z.Y.[Zhuo-Yuan],
Sun, Y.[Yi],
Parameter efficient finetuning of text-to-image models with trainable
self-attention layer,
IVC(151), 2024, pp. 105296.
Elsevier DOI
2411
T2I models, Efficient finetuning, Attention control
BibRef
Croitoru, F.A.[Florinel-Alin],
Hondru, V.[Vlad],
Ionescu, R.T.[Radu Tudor],
Shah, M.[Mubarak],
Reverse Stable Diffusion: What prompt was used to generate this
image?,
CVIU(249), 2024, pp. 104210.
Elsevier DOI Code:
WWW Link.
2412
Diffusion models, Reverse engineering,
Image-to-prompt prediction, Text-to-image generation
BibRef
Ibarrola, F.[Francisco],
Lulham, R.[Rohan],
Grace, K.[Kazjon],
Affect-Conditioned Image Generation,
AffCom(15), No. 4, October 2024, pp. 2169-2179.
IEEE DOI
2412
Training, Semantics, Predictive models, Creativity,
Computational modeling, Task analysis, Neural networks,
semantic models
BibRef
Li, A.[Ailin],
Zhao, L.[Lei],
Zuo, Z.W.[Zhi-Wen],
Xing, W.[Wei],
Lu, D.M.[Dong-Ming],
Specific Diverse Text-to-Image Synthesis via Exemplar Guidance,
MultMedMag(31), No. 4, October 2024, pp. 37-48.
IEEE DOI
2501
Visualization, Task analysis, Semantics, Image synthesis, Generators,
Training, Vectors
BibRef
Zhang, Y.[Yue],
Peng, C.T.[Cheng-Tao],
Wang, Q.[Qiuli],
Song, D.[Dan],
Li, K.Y.[Kai-Yan],
Zhou, S.K.[S. Kevin],
Unified Multi-Modal Image Synthesis for Missing Modality Imputation,
MedImg(44), No. 1, January 2025, pp. 4-18.
IEEE DOI
2501
Image synthesis, Imputation, Medical diagnostic imaging,
Task analysis, Streams, Training, Feature extraction, data imputation
BibRef
Zhou, Y.P.[Yu-Peng],
Zhou, D.Q.[Da-Quan],
Wang, Y.X.[Ya-Xing],
Feng, J.S.[Jia-Shi],
Hou, Q.B.[Qi-Bin],
MaskDiffusion: Boosting Text-to-Image Consistency with Conditional Mask,
IJCV(133), No. 5, May 2025, pp. 2805-2824.
Springer DOI
2504
BibRef
Xiong, H.[Huolin],
Li, Z.K.[Ze-Kun],
Lv, Q.[Qunbo],
Zhu, B.Y.[Bao-Yu],
Zhang, Y.[Yu],
Yu, C.Y.[Chao-Yang],
Tan, Z.[Zheng],
OP-Gen: A High-Quality Remote Sensing Image Generation Algorithm
Guided by OSM Images and Textual Prompts,
RS(17), No. 7, 2025, pp. 1226.
DOI Link
2504
BibRef
Peng, D.[Duo],
Ke, Q.H.[Qiu-Hong],
Huang, M.H.[Mark He],
Hu, P.[Ping],
Liu, J.[Jun],
Unified Prompt Attack Against Text-to-Image Generation Models,
PAMI(47), No. 6, June 2025, pp. 4816-4834.
IEEE DOI
2505
Optimization, Closed box, Visualization, Training, Adaptation models,
Semantics, Security, Text to image, Glass box, naturalness
BibRef
Wang, Q.H.[Qing-He],
Li, B.[Baolu],
Li, X.M.[Xiao-Min],
Cao, B.[Bing],
Ma, L.Q.[Li-Qian],
Lu, H.C.[Hu-Chuan],
Jia, X.[Xu],
CharacterFactory: Sampling Consistent Characters With GANs for
Diffusion Models,
IP(34), 2025, pp. 2544-2559.
IEEE DOI Code:
WWW Link.
2505
Focus on the person (character).
Diffusion models, Training, Text to image, Optimization,
Noise reduction, Character generation, Image synthesis, character creation
BibRef
Wang, W.[Wen],
Zhao, C.Y.[Can-Yu],
Chen, H.[Hao],
Chen, Z.K.[Zhe-Kai],
Zheng, K.C.[Ke-Cheng],
Shen, C.H.[Chun-Hua],
AutoStory: Generating Diverse Storytelling Images with Minimal Human
Efforts,
IJCV(133), No. 6, June 2025, pp. 3083-3104.
Springer DOI
2505
BibRef
Zhang, H.W.[Huai-Wen],
Wu, T.[Tianci],
Wei, Y.W.[Yin-Wei],
Multi-View User Preference Modeling for Personalized Text-to-Image
Generation,
MultMed(27), 2025, pp. 3082-3091.
IEEE DOI
2506
Adaptation models, User preference, Text to image,
Large language models, Training, Analytical models
BibRef
Deng, K.[Kai],
Wei, S.Y.[Si-Yuan],
Pang, S.Y.[Shi-Yan],
Jiang, H.[Huiwei],
Su, B.[Bo],
Synthesizing Remote Sensing Images from Land Cover Annotations via
Graph Prior Masked Diffusion,
RS(17), No. 13, 2025, pp. 2254.
DOI Link
2507
BibRef
Lei, S.Y.[Shi-Ye],
Chen, H.[Hao],
Zhang, S.[Sen],
Zhao, B.[Bo],
Tao, D.C.[Da-Cheng],
Image Captions are Natural Prompts for Training Data Synthesis,
IJCV(133), No. 8, August 2025, pp. 5435-5454.
Springer DOI
2508
Use the given caption to generate the image.
BibRef
Li, X.[Xiao],
Chen, L.Q.[Li-Quan],
Fu, T.[Tong],
Fu, Z.J.[Zhang-Jie],
Gao, Y.[Yuan],
Coverless Image Steganography Based on Semantic-Controlled
Text-to-Image Generation,
CirSysVideo(35), No. 8, August 2025, pp. 8391-8405.
IEEE DOI
2508
Semantics, Text to image, Steganography, Security, Costs,
Artificial intelligence, Visualization, Receivers,
artificial intelligence generated content (AIGC)
BibRef
Ma, Z.[Zehong],
Chen, H.[Hao],
Zeng, W.[Wei],
Su, L.M.[Li-Min],
Zhang, S.L.[Shi-Liang],
Multi-Modal Reference Learning for Fine-Grained Text-to-Image
Retrieval,
MultMed(27), 2025, pp. 5009-5022.
IEEE DOI
2509
Visualization, Text to image, Representation learning, Feature extraction,
Training, Image retrieval, Semantics, Aggregates, proxy learning
BibRef
Qazi, T.[Tayeba],
Lall, B.[Brejesh],
Mukherjee, P.[Prerana],
ThermalDiff: A diffusion architecture for thermal image synthesis,
JVCIR(111), 2025, pp. 104524.
Elsevier DOI
2509
Thermal image synthesis, Diffusion models, Image generation,
Synthetic images, Infrared image estimation
BibRef
Tian, Y.[Yu],
Liu, Y.[Yue],
Wang, S.Q.[Shi-Qi],
Kwong, S.[Sam],
Quality Assessment for Text-to-Image Generation: A Survey,
MultMedMag(32), No. 2, April 2025, pp. 44-52.
IEEE DOI
2510
Survey, Text to Image.
Quality assessment, Image color analysis, Text to image,
Measurement, Image quality, Feature extraction, Toxicology, Surveys,
Visualization
BibRef
Xu, M.L.[Meng-Ling],
Tao, M.[Ming],
Wang, J.[Jie],
Bao, B.K.[Bing-Kun],
SD-Prompt: Learnable and Adaptive Prompts for Enhancing
Subject-Driven Text-to-Image Synthesis,
MultMedMag(32), No. 3, July 2025, pp. 94-104.
IEEE DOI
2510
Text to image, Computational modeling, Transformers, Manuals,
Adaptation models, Uncertainty, Tuning, Training, Image synthesis, Computational efficiency
BibRef
D'Incà, M.[Moreno],
Peruzzo, E.[Elia],
Mancini, M.[Massimiliano],
Xu, X.Q.[Xing-Qian],
Shi, H.[Humphrey],
Sebe, N.[Nicu],
GradBias: Unveiling Word Influence on Bias in Text-to-Image
Generative Models,
PAMI(47), No. 11, November 2025, pp. 9863-9875.
IEEE DOI
2510
Pipelines, Correlation, Portable computers, Ethnicity, Training,
Foundation models, Analytical models, Text to image, Standards, bias
BibRef
Zheng, J.W.[Jian-Wei],
Xu, N.[Ni],
Li, W.[Wei],
Jiang, J.W.[Jia-Wei],
Zhang, X.Q.[Xiao-Qin],
Semantic-Spatial Attention for Refined Object Placement in
Text-to-Image Synthesis,
MultMed(27), 2025, pp. 7255-7270.
IEEE DOI
2510
Layout, Dogs, Diffusion models, Image synthesis, Text to image, Visualization,
Semantics, Proposals, Noise reduction, Coherence, cross-attention
BibRef
Dong, Z.Y.[Zi-Yi],
Wei, P.X.[Peng-Xu],
Lin, L.[Liang],
DreamArtist: Controllable One-Shot Text-to-Image Generation via
Positive-Negative Adapter,
IJCV(133), No. 10, October 2025, pp. 7037-7053.
Springer DOI
2511
BibRef
Choi, D.[Dooho],
Sung, Y.[Yunsick],
PixTention: Dynamic pixel-level adapter using attention maps,
IVC(163), 2025, pp. 105746.
Elsevier DOI
2511
LoRA, Adapter retrieval, Adaptation, Image generation, Diffusion
BibRef
Townsell, D.[Douglas],
Chen, L.W.[Ling-Wei],
Xie, M.[Mimi],
Pan, C.[Chen],
Zhang, W.[Wen],
STARS: Semantics-Aware Text-guided Aerial Image Refinement and
Synthesis,
CVIU(262), 2025, pp. 104561.
Elsevier DOI
2512
Aerial image synthesis, Image retrieval, Object retrieval, Diffusion
BibRef
Han, Y.X.[Yue-Xing],
Ruan, L.H.[Li-Heng],
Wang, B.[Bing],
Few-shot image generation via information transfer from the built
Geodesic surface,
PR(172), 2026, pp. 112293.
Elsevier DOI
2512
Few-shot image generation, GAN, The shape space theory, Data augmentation
BibRef
Hou, X.Y.[Xin-Yu],
Li, X.M.[Xiao-Ming],
Loy, C.C.[Chen Change],
AITTI: Learning Adaptive Inclusive Token for Text-to-Image Generation,
IJCV(134), No. 1, January 2026, pp. 108.
Springer DOI
WWW Link.
2602
BibRef
Chen, H.B.[Hai-Bo],
Zuo, Z.W.[Zhi-Wen],
Zhao, L.[Lei],
Li, J.[Jun],
Yang, J.[Jian],
ConceptCraft: One-Shot Personalized Text-to-Image Generation via
Object-Background Disentanglement,
CirSysVideo(36), No. 1, January 2026, pp. 133-146.
IEEE DOI
2602
Text to image, Diffusion models, Training, Object recognition,
Image synthesis, Videos, Circuits and systems, Visualization, Tuning,
identifier regularization scheme
BibRef
Xing, P.[Peng],
Wang, N.[Ning],
Sun, Y.P.[Yan-Peng],
Tang, J.H.[Jin-Hui],
Li, Z.C.[Ze-Chao],
Refine, Control and Distill: A Text-to-Image Framework for Faithful
Image Generation,
PAMI(48), No. 3, March 2026, pp. 2296-2311.
IEEE DOI
2602
Text to image, Noise reduction, Image synthesis, Diffusion models,
Birds, Semantics, Dogs, Layout, Toy manufacturing industry,
text-to-image diffusion model
BibRef
Liu, C.[Chang],
Li, R.[Rui],
Zhang, K.[Kaidong],
Lan, Y.W.[Yun-Wei],
Luo, X.[Xin],
Liu, D.[Dong],
LaCon: Late-Constraint Controllable Visual Generation,
IP(35), 2026, pp. 1111-1126.
IEEE DOI
2602
Visualization, Noise, Diffusion models, Process control,
Image color analysis, Transformers, Noise measurement,
text-to-image generation
BibRef
Yang, Y.[Yunuo],
Cheng, Y.W.[You-Wei],
Hu, J.L.[Jin-Long],
Xia, Y.[Yan],
Zang, Y.[Yu],
Text2AIRS: Fine-Grained Airplane Image Generation in Remote Sensing
from Nature Language,
RS(18), No. 3, 2026, pp. 511.
DOI Link
2602
Generate aerial images.
BibRef
Ma, X.L.[Xin-Liang],
Luo, J.W.[Jun-Wei],
Ni, S.P.[Shui-Ping],
Zhang, X.H.[Xiao-Hong],
Ding, R.Z.[Run-Ze],
SDLS: A Two-Stream Architecture with Self-Distillation and Local
Streams for Remote Sensing Image Scene Classification,
RS(18), No. 3, 2026, pp. 498.
DOI Link
2602
BibRef
Lu, P.[Peiyu],
Li, X.X.[Xiao-Xu],
Zhu, R.[Rui],
Ma, Z.Y.[Zhan-Yu],
Cao, J.[Jie],
Xue, J.H.[Jing-Hao],
Fine-Tuning via Linked Domains: A Closed-Form Dual Alignment
Mechanism for Transferring Vision-Language Models,
CirSysVideo(36), No. 3, March 2026, pp. 3613-3623.
IEEE DOI Code:
WWW Link.
2603
Computational modeling, Dams, Visualization, Adaptation models,
Text to image, Predictive models, Closed-form solutions, Videos,
feature alignment
BibRef
Shi, L.[Liang],
Zhang, J.[Jie],
Shan, S.G.[Shi-Guang],
Anonymization Prompt Learning for Facial Privacy-Preserving
Text-to-Image Generation,
IJCV(134), No. 4, April 2026, pp. 192.
Springer DOI
2603
BibRef
Liu, D.Y.[Dong-Yang],
Xin, Y.[Yi],
Zhao, S.T.[Shi-Tian],
Zhuo, L.[Le],
Lin, W.F.[Wei-Feng],
Li, X.Y.[Xin-Yue],
Qin, Q.[Qi],
Zhai, G.T.[Guang-Tao],
Liu, X.H.[Xiao-Hong],
Li, H.S.[Hong-Sheng],
Qiao, Y.[Yu],
Gao, P.[Peng],
Lumina-mGPT: Flexible Photorealistic Autoregressive Text-to-Image
Generation,
IJCV(134), No. 4, April 2026, pp. 141.
Springer DOI
2603
BibRef
Chen, G.[Gordon],
Huang, Z.Q.[Zi-Qi],
Tan, C.[Cheston],
Liu, Z.W.[Zi-Wei],
Stencil: Subject-Driven Generation with Context Guidance,
ICIP25(719-724)
IEEE DOI
2601
Image quality, Visualization, Technological innovation,
Image resolution, Computational modeling, Text to image,
Subject-Driven Generation
BibRef
Ni, Y.[Yao],
Wen, S.[Song],
Koniusz, P.[Piotr],
Cherian, A.[Anoop],
Noise Consistency Regularization for Improved Subject-Driven Image
Synthesis,
SyntaGen25(3107-3117)
IEEE DOI
2512
Adaptation models, Visualization, Codes, Image synthesis, Noise,
Text to image, Predictive models, Robustness, Overfitting, Generative AI
BibRef
Campi, R.[Riccardo],
Borrego, S.[Santiago],
de Santis, A.[Antonio],
Bianchi, M.[Matteo],
Tocchetti, A.[Andrea],
Brambilla, M.[Marco],
Towards Synthetic Concept Activation Vectors via Generative Models,
XAI4CV25(2711-2719)
IEEE DOI
2512
Training, Phase measurement, Explainable AI, Natural languages,
Measurement uncertainty, Text to image, Quality control, Vectors,
multimodal xai
BibRef
Fallah, F.[Forouzan],
Patel, M.[Maitreya],
Chatterjee, A.[Agneet],
Morariu, V.I.[Vlad I.],
Baral, C.[Chitta],
Yang, Y.Z.[Ye-Zhou],
TextInVision: Text and Prompt Complexity Driven Visual Text
Generation Benchmark,
AIBench25(525-534)
IEEE DOI Code:
WWW Link.
2512
Visualization, Analytical models, Accuracy, Text to image,
Production, Benchmark testing, Rendering (computer graphics),
diffusion-based text-to-image models
BibRef
Agarwal, A.[Aishwarya],
Karanam, S.[Srikrishna],
Srinivasan, B.V.[Balaji Vasan],
Training-Free Color-Style Disentanglement for Constrained
Text-to-Image Synthesis,
AIConGen25(6227-6236)
IEEE DOI
2512
Technological innovation, Image color analysis, Fuses,
Text to image, Transforms, Diffusion models, Covariance matrices
BibRef
Han, S.[Shuhao],
Fan, H.T.[Hao-Tian],
Kong, F.Y.[Fang-Yuan],
Liao, W.J.[Wen-Jie],
Guo, C.[Chunle],
Li, C.Y.[Chong-Yi],
Timofte, R.[Radu],
Li, L.[Liang],
Li, T.[Tao],
Cui, J.H.[Jun-Hui],
Wang, Y.Q.[Yun-Qiu],
Tai, Y.[Yang],
Sun, J.W.[Jing-Wei],
Sun, J.H.[Jian-Hui],
Yue, X.[Xinli],
Wang, T.Y.[Tian-Yi],
Hou, H.[Huan],
Lu, J.[Junda],
Huang, X.Y.[Xin-Yang],
Zhou, Z.[Zitang],
Zhang, Z.J.[Zi-Jian],
Zheng, X.H.[Xu-Hui],
Wu, X.C.[Xue-Cheng],
Peng, C.[Chong],
Cao, X.Z.[Xue-Zhi],
Nguyen-Mau, T.H.[Trong-Hieu],
Le, M.H.[Minh-Hoang],
Le-Phan, M.K.[Minh-Khoa],
Ly, D.N.[Duy-Nam],
Nguyen, H.D.[Hai-Dang],
Tran, M.T.[Minh-Triet],
Lin, Y.[Yukang],
Hong, Y.[Yan],
Song, C.[Chuanbiao],
Li, S.Y.[Si-Yuan],
Lan, J.[Jun],
Zhang, Z.C.[Zhi-Chao],
Li, X.Y.[Xin-Yue],
Sun, W.[Wei],
Zhang, Z.C.[Zi-Cheng],
Li, Y.H.[Yun-Hao],
Liu, X.H.[Xiao-Hong],
Zhai, G.T.[Guang-Tao],
Xu, Z.T.[Zi-Tong],
Duan, H.Y.[Hui-Yu],
Wang, J.R.[Jia-Rui],
Ma, G.[Guangji],
Yang, L.[Liu],
Liu, L.[Lu],
Hu, Q.[Qiang],
Min, X.K.[Xiong-Kuo],
Wang, Z.[Zichuan],
Tang, Z.C.[Zhen-Chen],
Peng, B.[Bo],
Dong, J.[Jing],
Guan, F.B.[Feng-Bin],
Yu, Z.[Zihao],
Lu, Y.T.[Yi-Ting],
Luo, W.[Wei],
Li, X.[Xin],
Lin, M.[Minhao],
Chen, H.F.[Hao-Feng],
He, X.[Xuanxuan],
Xu, K.[Kele],
Xu, Q.[Qisheng],
Gao, Z.J.[Zi-Jian],
Wan, T.J.[Tian-Jiao],
Qiu, B.C.[Bo-Cheng],
Hsu, C.C.[Chih-Chung],
Lee, C.M.[Chia-Ming],
Lin, Y.F.[Yu-Fan],
Yu, B.[Bo],
Wang, Z.[Zehao],
Mu, D.[Da],
Chen, M.X.[Ming-Xiu],
Fang, J.K.[Jun-Kang],
Sun, H.[Huamei],
Zhao, W.D.[Wen-Ding],
Wang, Z.Y.[Zhi-Yu],
Liu, W.[Wang],
Yu, W.K.[Wei-Kang],
Duan, P.H.[Pu-Hong],
Sun, B.[Bin],
Kang, X.D.[Xu-Dong],
Li, S.T.[Shu-Tao],
He, S.[Shuai],
Fu, L.Z.[Ling-Zhi],
Cong, H.[Heng],
Zhang, R.[Rongyu],
He, J.R.[Jia-Rong],
Qiao, Z.S.[Zhi-Shan],
Huang, Y.Q.[Yong-Qing],
Chen, Z.[Zewen],
Pang, Z.[Zhe],
Wang, J.[Juan],
Guo, J.[Jian],
Shao, Z.Z.[Zhi-Zhuo],
Feng, Z.Y.[Zi-Yu],
Li, B.[Bing],
Hu, W.M.[Wei-Ming],
Li, H.[Hesong],
Liu, D.H.[De-Hua],
Liu, Z.M.[Ze-Ming],
Xie, Q.S.[Qing-Song],
Wang, R.C.[Rui-Chen],
Li, Z.H.[Zhi-Hao],
Liang, Y.Q.[Yu-Qi],
Bi, J.Q.[Jian-Qi],
Luo, J.[Jun],
Yang, J.F.[Jun-Feng],
Li, C.[Can],
Fu, J.[Jing],
Xu, H.W.[Hong-Wei],
Long, M.R.[Ming-Rui],
Tang, L.[Lulin],
NTIRE 2025 challenge on Text to Image Generation Model Quality
Assessment,
NTIRE25(1095-1116)
IEEE DOI
2512
Computational modeling, Text to image, Predictive models,
Distortion, Market research, Tires, Quality assessment, Image restoration
BibRef
Hu, T.[Teng],
Zhang, J.N.[Jiang-Ning],
Yi, R.[Ran],
Weng, J.[Jieyu],
Wang, Y.B.[Ya-Biao],
Zeng, X.F.[Xian-Fang],
Xue, Z.[Zhucun],
Ma, L.Z.[Li-Zhuang],
Improving Autoregressive Visual Generation with Cluster-Oriented
Token Prediction,
CVPR25(9351-9360)
IEEE DOI Code:
WWW Link.
2508
Training, Visualization, Correlation, Natural languages,
Clustering algorithms, Predictive models, Prediction algorithms, Indexes
BibRef
Hu, X.X.[Xi-Xi],
Xu, K.[Keyang],
Liu, B.[Bo],
Liu, Q.[Qiang],
Fei, H.L.[Hong-Liang],
AMO Sampler: Enhancing Text Rendering with Overshooting,
CVPR25(13157-13166)
IEEE DOI
2508
Radio frequency, Image quality, Accuracy, Computational modeling,
Noise, Text to image, Ordinary differential equations, image generation
BibRef
Aiello, E.[Emanuele],
Michieli, U.[Umberto],
Valsesia, D.[Diego],
Ozay, M.[Mete],
Magli, E.[Enrico],
DreamCache: Finetuning-Free Lightweight Personalized Image Generation
via Feature Caching,
CVPR25(12480-12489)
IEEE DOI
2508
Training, Adaptation models, Costs, Image synthesis,
Face recognition, Computational modeling, Text to image, caching
BibRef
Talon, D.[Davide],
Girella, F.[Federico],
Liu, Z.Y.[Zi-Yue],
Cristani, M.[Marco],
Wang, Y.M.[Yi-Ming],
Seeing the Abstract: Translating the Abstract Language for Vision
Language Models,
CVPR25(9253-9262)
IEEE DOI
2508
Visualization, Translation, Databases, Prevention and mitigation,
Text to image, Focusing, Motion pictures, Information filtering,
Information integrity
BibRef
Yang, J.[Jian],
Yin, D.C.[Da-Cheng],
Zhou, Y.Z.[Yi-Zhou],
Rao, F.Y.[Feng-Yun],
Zhai, W.[Wei],
Cao, Y.[Yang],
Zha, Z.J.[Zheng-Jun],
MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic
Modeling,
CVPR25(7974-7985)
IEEE DOI
2508
Training, Image synthesis, Computational modeling,
Large language models, Noise reduction, Propulsion, Numerical stability
BibRef
Mu, J.[Jiteng],
Vasconcelos, N.M.[Nuno M.],
Wang, X.L.[Xiao-Long],
EditAR: Unified Conditional Generation with Autoregressive Models,
CVPR25(7899-7909)
IEEE DOI Code:
WWW Link.
2508
Adaptation models, Visualization, Image segmentation, Translation,
Image synthesis, Computational modeling, Text to image, Switches
BibRef
Yuan, Y.[Yu],
Wang, X.[Xijun],
Sheng, Y.C.[Yi-Chen],
Chennuri, P.[Prateek],
Zhang, X.G.[Xing-Guang],
Chan, S.[Stanley],
Generative Photography: Scene-Consistent Camera Control for Realistic
Text-to-Image Synthesis,
CVPR25(7920-7930)
IEEE DOI Code:
WWW Link.
2508
Photography, Technological innovation, Image synthesis,
Text to image, Cameras, Generators, Physics, Lenses,
generative models
BibRef
Morshed, M.M.[Mashrur M.],
Boddeti, V.[Vishnu],
DiverseFlow: Sample-Efficient Diverse Mode Coverage in Flows,
CVPR25(23303-23312)
IEEE DOI
2508
Flow-based generative model.
Couplings, Image synthesis, Inverse problems, diversity,
determinantal point processes, flow matching
BibRef
Xie, C.[Cong],
Zou, H.[Han],
Yu, R.Q.[Rui-Qi],
Zhang, Y.[Yan],
Zhan, Z.[Zhenpeng],
SerialGen: Personalized Image Generation by First Standardization
Then Personalization,
CVPR25(2847-2856)
IEEE DOI
2508
Analytical models, Image synthesis, Standardization,
Controllability, personalized image generation, diffusion model,
text to image generation
BibRef
Liu, M.S.[Mu-Shui],
She, D.[Dong],
Pang, J.X.[Jing-Xuan],
Huang, Q.[Qihan],
Ying, J.C.[Jia-Cheng],
He, W.G.[Wang-Gui],
Hou, Y.L.[Yuan-Lei],
Fu, S.[Siming],
TFCustom: Customized Image Generation with Time-Aware Frequency
Feature Guidance,
CVPR25(2714-2723)
IEEE DOI
2508
Time-frequency analysis, Filters, Limiting, Image synthesis,
Noise reduction, Noise, Feature extraction, Synchronization
BibRef
Kumbong, H.[Hermann],
Liu, X.[Xian],
Lin, T.Y.[Tsung-Yi],
Liu, M.Y.[Ming-Yu],
Liu, X.H.[Xi-Hui],
Liu, Z.W.[Zi-Wei],
Fu, D.Y.[Daniel Y.],
Ré, C.[Christopher],
Romero, D.W.[David W.],
HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation,
CVPR25(2535-2544)
IEEE DOI
2508
Training, Reactive power, Visualization, Schedules, Image resolution,
Image synthesis, Memory management, Predictive models, image generation
BibRef
Peng, Y.Y.[Yu-Yang],
Xiao, S.[Shishi],
Wu, K.M.[Ke-Ming],
Liao, Q.[Qisheng],
Chen, B.[Bohan],
Lin, K.[Kevin],
Huang, D.Q.[Dan-Qing],
Li, J.[Ji],
Yuan, Y.H.[Yu-Hui],
BizGen: Advancing Article-level Visual Text Rendering for
Infographics Generation,
CVPR25(23615-23624)
IEEE DOI
2508
Visualization, Layout, Text to image, Benchmark testing,
Rendering (computer graphics), Multilingual, Engines, Business
BibRef
Zhao, Z.Q.[Zeng-Qun],
Liu, Z.Q.[Zi-Quan],
Cao, Y.[Yu],
Gong, S.G.[Shao-Gang],
Patras, I.[Ioannis],
AIM-Fair: Advancing Algorithmic Fairness via Selectively Fine-Tuning
Biased Models with Contextual Synthetic Data,
CVPR25(28748-28758)
IEEE DOI Code:
WWW Link.
2508
Machine learning algorithms, Annotations, Computational modeling,
Prevention and mitigation, Text to image, Diffusion models, Context modeling
BibRef
Didolkar, A.[Aniket],
Zadaianchuk, A.[Andrii],
Awal, R.[Rabiul],
Seitzer, M.[Maximilian],
Gavves, E.[Efstratios],
Agrawal, A.[Aishwarya],
CTRL-O: Language-Controllable Object-Centric Visual Representation
Learning,
CVPR25(29523-29533)
IEEE DOI
2508
Representation learning, Visualization, Image synthesis, Grounding,
Text to image, Controllability
BibRef
Cheng, H.[Hao],
Xiao, E.[Erjia],
Yang, J.Y.[Jia-Yan],
Cao, J.H.[Jia-Hang],
Zhang, Q.[Qiang],
Zhang, J.[Jize],
Xu, K.D.[Kai-Di],
Gu, J.D.[Jin-Dong],
Xu, R.[Renjing],
Not Just Text: Uncovering Vision Modality Typographic Threats in
Image Generation Models,
CVPR25(2997-3007)
IEEE DOI
2508
Image synthesis, Computational modeling, Buildings, Text to image,
Benchmark testing, Robustness, Security
BibRef
Zhang, J.J.[Jin-Jin],
Huang, Q.Y.[Qiu-Yu],
Liu, J.J.[Jun-Jie],
Guo, X.[Xiefan],
Huang, D.[Di],
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent
Diffusion Models,
CVPR25(23464-23473)
IEEE DOI Code:
WWW Link.
2508
Training, Measurement, Image coding, Codes, Image synthesis,
Text to image, Benchmark testing, Diffusion models, diffusion-4k,
latent diffusion models
BibRef
Kong, L.J.[Ling-Jie],
Wu, K.[Kai],
Xu, C.M.[Cheng-Ming],
Hu, X.B.[Xia-Bin],
Han, W.H.[Wen-Hui],
Peng, J.L.[Jin-Long],
Luo, D.H.[Dong-Hao],
Li, M.T.[Meng-Tian],
Zhang, J.N.[Jiang-Ning],
Wang, C.J.[Cheng-Jie],
Fu, Y.W.[Yan-Wei],
CustAny: Customizing Anything from A Single Example,
CVPR25(20916-20925)
IEEE DOI Code:
WWW Link.
2508
Pipelines, Text to image, Dogs, Feature extraction
BibRef
Na, S.[Sanghyeon],
Kim, Y.[Yonggyu],
Lee, H.[Hyunjoon],
Boost Your Human Image Generation Model via Direct Preference
Optimization,
CVPR25(23551-23562)
IEEE DOI
2508
Training, Image quality, Adaptation models, Limiting,
Image synthesis, Social networking (online), Text to image,
text-to-image generation
BibRef
Wang, X.D.[Xu-Dong],
Zhou, X.Y.[Xing-Yi],
Fathi, A.[Alireza],
Darrell, T.J.[Trevor J.],
Schmid, C.[Cordelia],
Visual Lexicon: Rich Image Features in Language Space,
CVPR25(19736-19747)
IEEE DOI
2508
Training, Visualization, Vocabulary, Image synthesis, Semantics,
Pipelines, Text to image, Self-supervised learning, Image reconstruction
BibRef
Zhu, J.P.[Jia-Peng],
Yang, C.[Ceyuan],
Zheng, K.[Kecheng],
Xu, Y.H.[Ying-Hao],
Shi, Z.[Zifan],
Zhang, Y.F.[Yi-Fei],
Chen, Q.F.[Qi-Feng],
Shen, Y.J.[Yu-Jun],
Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis,
CVPR25(18411-18423)
IEEE DOI
2508
Training, Adaptation models, Ion radiation effects,
Image resolution, Image synthesis, Text to image, Magnetosphere
BibRef
Yun, T.[Taeyoung],
Zhang, D.[Dinghuai],
Park, J.[Jinkyoo],
Pan, L.[Ling],
Learning to Sample Effective and Diverse Prompts for Text-to-Image
Generation,
CVPR25(23625-23635)
IEEE DOI
2508
Adaptation models, Systematics, Image synthesis, Text to image,
Process control, Reinforcement learning, Probabilistic logic, alignment
BibRef
Um, S.[Soobin],
Ye, J.C.[Jong Chul],
Minority-Focused Text-to-Image Generation via Prompt Optimization,
CVPR25(20926-20936)
IEEE DOI Code:
WWW Link.
2508
Codes, Semantics, Text to image, Diffusion models, Linear programming,
Data augmentation, Generators, Optimization, minority generation
BibRef
Han, W.[Woojung],
Lee, Y.[Yeonkyung],
Kim, C.[Chanyoung],
Park, K.[Kwanghyun],
Hwang, S.J.[Seong Jae],
Spatial Transport Optimization by Repositioning Attention Map for
Training-Free Text-to-Image Synthesis,
CVPR25(18401-18410)
IEEE DOI
2508
Storms, Image synthesis, Noise reduction, Text to image,
Spatial coherence, Focusing, Benchmark testing, Cost function,
text-to-image synthesis
BibRef
Xing, X.Y.[Xiao-Ying],
Saha, A.[Avinab],
He, J.F.[Jun-Feng],
Hao, S.[Susan],
Vicol, P.[Paul],
Ryu, M.[Moonkyung],
Li, G.[Gang],
Singla, S.[Sahil],
Young, S.[Sarah],
Li, Y.X.[Yin-Xiao],
Yang, F.[Feng],
Ramachandran, D.[Deepak],
Focus-N-Fix: Region-Aware Fine-Tuning for Text-to-Image Generation,
CVPR25(18486-18496)
IEEE DOI
2508
Image quality, Degradation, Training, Location awareness,
Computational modeling, Current measurement, Text to image,
learning from human preference
BibRef
Liang, D.[Dong],
Jia, J.Y.[Jin-Yuan],
Liu, Y.H.[Yu-Hao],
Ke, Z.H.[Zhang-Han],
Fu, H.B.[Hong-Bo],
Lau, R.W.H.[Rynson W.H.],
VODiff: Controlling Object Visibility Order in Text-to-Image
Generation,
CVPR25(18379-18389)
IEEE DOI
2508
Image quality, Image synthesis, Layout, Noise reduction, Merging,
Text to image, Transforms, Optimization, Photorealistic images
BibRef
Jo, K.[Kyungmin],
Yun, J.[Jooyeol],
Choo, J.[Jaegul],
Devil is in the Detail: Towards Injecting Fine Details of Image
Prompt in Image Generation via Conflict-free Guidance and Stratified
Attention,
CVPR25(23595-23603)
IEEE DOI
2508
Training, Hands, Visualization, Image resolution, Image synthesis,
Scalability, Text to image, Diffusion models, Signal resolution
BibRef
Zhang, Z.C.[Zi-Cheng],
Kou, T.C.[Teng-Chuan],
Wang, S.[Shushi],
Li, C.Y.[Chun-Yi],
Sun, W.[Wei],
Wang, W.[Wei],
Li, X.Y.[Xiao-Yu],
Wang, Z.Y.[Zong-Yu],
Cao, X.Z.[Xue-Zhi],
Min, X.K.[Xiong-Kuo],
Liu, X.H.[Xiao-Hong],
Zhai, G.T.[Guang-Tao],
Q-Eval-100K: Evaluating Visual Quality and Alignment Level for
Text-to-Vision Content,
CVPR25(10621-10631)
IEEE DOI Code:
WWW Link.
2508
Visualization, Annotations, Computational modeling,
Evaluation models, Text to image, Fasteners, Reliability,
Context modeling
BibRef
Jung, S.[Sangwon],
Oesterling, A.[Alex],
Verdun, C.M.[Claudio Mayrink],
Vithana, S.[Sajani],
Moon, T.[Taesup],
Calmon, F.P.[Flavio P.],
Multi-Group Proportional Representation for Text-to-Image Models,
CVPR25(23744-23754)
IEEE DOI
2508
Measurement, Training, Image synthesis, Computational modeling,
Text to image, Control systems, Context modeling, bias
BibRef
Xiao, S.T.[Shi-Tao],
Wang, Y.Z.[Yue-Ze],
Zhou, J.J.[Jun-Jie],
Yuan, H.[Huaying],
Xing, X.[Xingrun],
Yan, R.[Ruiran],
Li, C.F.[Chao-Fan],
Wang, S.T.[Shu-Ting],
Huang, T.J.[Tie-Jun],
Liu, Z.[Zheng],
OmniGen: Unified Image Generation,
CVPR25(13294-13304)
IEEE DOI Code:
WWW Link.
2508
Human computer interaction, Image synthesis,
Computational modeling, Large language models, Text to image,
multimodal understanding
BibRef
Chen, J.[Jierun],
Hu, D.T.[Dong-Ting],
Huang, X.J.[Xi-Jie],
Coskun, H.[Huseyin],
Sahni, A.[Arpit],
Gupta, A.[Aarush],
Goyal, A.[Anujraaj],
Lahiri, D.[Dishani],
Singh, R.[Rajesh],
Idelbayev, Y.[Yerlan],
Cao, J.L.[Jun-Li],
Li, Y.Y.[Yan-Yu],
Cheng, K.T.[Kwang-Ting],
Chan, S.H.G.[S.H. Gary],
Gong, M.M.[Ming-Ming],
Tulyakov, S.[Sergey],
Kag, A.[Anil],
Xu, Y.[Yanwu],
Ren, J.[Jian],
SnapGen: Taming High-Resolution Text-To-Image Models for Mobile
Devices with Efficient Architectures and Training,
CVPR25(7997-8008)
IEEE DOI
2508
Training, Runtime, Computational modeling, Text to image,
Network architecture, Diffusion models, Mobile handsets, Faces,
mobile text-to-image models
BibRef
Franchi, G.[Gianni],
Belkhir, N.[Nacim],
Trong, D.N.[Dat Nguyen],
Xia, G.X.[Guo-Xuan],
Pilzer, A.[Andrea],
Towards Understanding and Quantifying Uncertainty for Text-to-Image
Generation,
CVPR25(8062-8072)
IEEE DOI Code:
WWW Link.
2508
Adaptation models, Uncertainty, Accuracy, Semantics,
Measurement uncertainty, Text to image, Estimation, Training data,
trustworthy ai
BibRef
Jia, C.Y.[Cheng-You],
Xia, C.L.[Chang-Liang],
Dang, Z.H.[Zhuo-Hang],
Wu, W.J.[Wei-Jia],
Qian, H.W.[Hang-Wei],
Luo, M.[Minnan],
ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting,
CVPR25(13284-13293)
IEEE DOI
2508
Image quality, Automation, Uncertainty, Computational modeling,
Face recognition, Text to image, Benchmark testing, Data models,
multimodal large language models
BibRef
Baumann, S.A.[Stefan Andreas],
Krause, F.[Felix],
Neumayr, M.[Michael],
Stracke, N.[Nick],
Sevi, M.[Melvin],
Hu, V.T.[Vincent Tao],
Ommer, B.[Björn],
Continuous, Subject-Specific Attribute Control in T2I Models by
Identifying Semantic Directions,
CVPR25(13231-13241)
IEEE DOI
2508
Location awareness, Costs, Image synthesis, Semantics, Modulation,
Text to image, Process control, Aerospace electronics
BibRef
Binyamin, L.[Lital],
Tewel, Y.[Yoad],
Segev, H.[Hilit],
Hirsch, E.[Eran],
Rassin, R.[Royi],
Chechik, G.[Gal],
Make It Count: Text-to-Image Generation with an Accurate Number of
Objects,
CVPR25(13242-13251)
IEEE DOI Code:
WWW Link.
2508
Training, Accuracy, Shape, Layout, Noise reduction, Text to image,
Predictive models, Diffusion models, Standards, text-to-image,
generative-models
BibRef
Li, F.F.[Fei-Fei],
Zhang, M.[Mi],
Sun, Y.M.[Yi-Ming],
Yang, M.[Min],
Detect-and-Guide: Self-regulation of Diffusion Models for Safe
Text-to-Image Generation via Guideline Token Optimization,
CVPR25(13252-13262)
IEEE DOI
2508
Adaptation models, Visualization, Prevention and mitigation,
Text to image, Diffusion models, Safety, Trajectory, Usability,
safe generation
BibRef
Wang, Z.[Zihao],
Wei, Y.X.[Yu-Xiang],
Li, F.[Fan],
Pei, R.[Renjing],
Xu, H.[Hang],
Zuo, W.M.[Wang-Meng],
ACE: Anti-Editing Concept Erasure in Text-to-Image Models,
CVPR25(23505-23515)
IEEE DOI Code:
WWW Link.
2508
Training, Filters, Filtration, Noise, Text to image,
Stochastic processes, Production, Predictive models, IP networks,
image generation
BibRef
Guo, Z.[Zirun],
Jin, T.[Tao],
ConceptGuard: Continual Personalized Text-to-Image Generation with
Forgetting and Confusion Mitigation,
CVPR25(2945-2954)
IEEE DOI
2508
Prevention and mitigation, Text to image, diffusion,
personalization, continual learning, image generation
BibRef
Azam, B.[Basim],
Akhtar, N.[Naveed],
Plug-and-Play Interpretable Responsible Text-to-Image Generation via
Dual-Space Multi-facet Concept Control,
CVPR25(2976-2985)
IEEE DOI Code:
WWW Link.
2508
Ethics, Image synthesis, Pipelines, Semantics, Text to image,
Aerospace electronics, Diffusion models, Tuning, Standards,
responsible image generation
BibRef
Zhao, R.[Rui],
Mao, W.J.[Wei-Jia],
Shou, M.Z.[Mike Zheng],
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in
Multimodal Cycles,
CVPR25(2835-2846)
IEEE DOI Code:
WWW Link.
2508
Training, Adaptation models, Codes, Computational modeling,
Text to image, Data models, Optimization
BibRef
Zhou, P.F.[Peng-Fei],
Peng, X.P.[Xiao-Peng],
Song, J.J.[Jia-Jun],
Li, C.[Chuanhao],
Xu, Z.[Zhaopan],
Yang, Y.[Yue],
Guo, Z.[Ziyao],
Zhang, H.[Hao],
Lin, Y.Q.[Yu-Qi],
He, Y.F.[Ye-Fei],
Zhao, L.[Lirui],
Liu, S.[Shuo],
Li, T.H.[Tian-Hua],
Xie, Y.X.[Yu-Xuan],
Chang, X.J.[Xiao-Jun],
Qiao, Y.[Yu],
Shao, W.Q.[Wen-Qi],
Zhang, K.[Kaipeng],
OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved
Image-Text Generation,
CVPR25(56-66)
IEEE DOI
2508
Training, Visualization, Large language models, Pipelines,
Benchmark testing, Brain modeling, Data models,
mllm-as-a-judge
BibRef
Qiu, Y.N.[Yu-Ning],
Wang, A.D.[An-Dong],
Li, C.[Chao],
Huang, H.N.[Hao-Nan],
Zhou, G.X.[Guo-Xu],
Zhao, Q.[Qibin],
STEPS: Sequential Probability Tensor Estimation for Text-to-Image
Hard Prompt Search,
CVPR25(28640-28650)
IEEE DOI
2508
Visualization, Tensors, Quantization (signal), Estimation,
Text to image, Diffusion models, Search problems,
probability estimation
BibRef
Poesina, E.[Eduard],
Costache, A.V.[Adriana Valentina],
Chifu, A.G.[Adrian-Gabriel],
Mothe, J.[Josiane],
Ionescu, R.T.[Radu Tudor],
PQPP: A Joint Benchmark for Text-to-Image Prompt and Query
Performance Prediction,
CVPR25(28651-28661)
IEEE DOI Code:
WWW Link.
2508
Codes, Image synthesis, Annotations, Image retrieval, Text to image,
Benchmark testing, Information retrieval, Diffusion models,
text-to-image retrieval
BibRef
Wang, A.Z.[Andrew Z.],
Ge, S.W.[Song-Wei],
Karras, T.[Tero],
Liu, M.Y.[Ming-Yu],
Balaji, Y.[Yogesh],
A Comprehensive Study of Decoder-Only LLMs for Text-to-Image
Generation,
CVPR25(28575-28585)
IEEE DOI
2508
Training, Analytical models, Computational modeling, Scalability,
Large language models, Pipelines, Semantics, Text to image
BibRef
Lakhanpal, S.[Sanyam],
Chopra, S.[Shivang],
Jain, V.[Vinija],
Chadha, A.[Aman],
Luo, M.[Man],
Refining Text-to-Image Generation: Towards Accurate Training-Free
Glyph-Enhanced Image Generation,
WACV25(4372-4381)
IEEE DOI Code:
WWW Link.
2505
Measurement, Visualization, Accuracy, Image synthesis,
Optical character recognition, Layout, Refining, Text to image,
layout optimization
BibRef
Agarwal, A.[Aishwarya],
Karanam, S.[Srikrishna],
Srinivasan, B.V.[Balaji Vasan],
AlignIT: Enhancing Prompt Alignment in Customization of Text-to-Image
Models,
WACV25(4882-4890)
IEEE DOI
2505
Surveys, Protocols, Semantics, Noise, Text to image, Predictive models,
Diffusion models, Encoding, Vectors, Image reconstruction
BibRef
Agarwal, A.[Aishwarya],
Karanam, S.[Srikrishna],
Shukla, T.[Tripti],
Srinivasan, B.V.[Balaji Vasan],
An Image is Worth Multiple Words: Multi-Attribute Inversion for
Constrained Text-To-Image Synthesis,
WACV25(6053-6062)
IEEE DOI
2505
Image color analysis, Image synthesis, Layout, Noise reduction,
Text to image, Aerospace electronics, Diffusion models
BibRef
Corneanu, C.A.[Ciprian A.],
Feng, Q.L.[Qian-Li],
Martinez, A.M.[Aleix M.],
Structured Human Assessment of Text-to-Image Generative Models,
WACV25(4481-4490)
IEEE DOI
2505
Image quality, Correlation, Image synthesis, Grounding, Semantics,
Text to image, Genomics, Bioinformatics
BibRef
Choi, H.[Hongsuk],
Kasahara, I.[Isaac],
Engin, S.[Selim],
Graule, M.A.[Moritz A.],
Chavan-Dafle, N.[Nikhil],
Isler, V.[Volkan],
FineControlNet: Fine-level Text Control for Image Generation with
Spatially Aligned Text Control Injection,
WACV25(3975-3984)
IEEE DOI
2505
Measurement, Image quality, Visualization, Image synthesis,
Statistical analysis, Image edge detection, Text to image, human pose
BibRef
Ban, Y.H.[Yuan-Hao],
Wang, R.C.[Ruo-Chen],
Zhou, T.Y.[Tian-Yi],
Cheng, M.[Minhao],
Gong, B.Q.[Bo-Qing],
Hsieh, C.J.[Cho-Jui],
Understanding the Impact of Negative Prompts: When and How Do They Take
Effect?,
ECCV24(LXXXIX: 190-206).
Springer DOI
2412
Negative prompts specify what to exclude from the generated images.
BibRef
Stracke, N.[Nick],
Baumann, S.A.[Stefan Andreas],
Susskind, J.[Joshua],
Bautista, M.A.[Miguel Angel],
Ommer, B.[Björn],
CTRLorALTer: Conditional LoRAdapter for Efficient 0-shot Control and
Altering of T2I Models,
ECCV24(LXXXVIII: 87-103).
Springer DOI
2412
BibRef
Hemmat, R.A.[Reyhane Askari],
Hall, M.[Melissa],
Sun, A.[Alicia],
Ross, C.[Candace],
Drozdzal, M.[Michal],
Romero-Soriano, A.[Adriana],
Improving Geo-diversity of Generated Images with Contextualized Vendi
Score Guidance,
ECCV24(LXXXVII: 213-229).
Springer DOI
2412
Code:
WWW Link.
BibRef
Li, P.Z.[Peng-Zhi],
Nie, Q.[Qiang],
Chen, Y.[Ying],
Jiang, X.[Xi],
Wu, K.[Kai],
Lin, Y.[Yuhuan],
Liu, Y.[Yong],
Peng, J.L.[Jin-Long],
Wang, C.J.[Cheng-Jie],
Zheng, F.[Feng],
Tuning-Free Image Customization with Image and Text Guidance,
ECCV24(LXXVI: 233-250).
Springer DOI
2412
Project:
WWW Link. Guided customization.
BibRef
Xue, X.T.[Xiang-Tian],
Wu, J.S.[Jia-Song],
Kong, Y.Y.[You-Yong],
Senhadji, L.[Lotfi],
Shu, H.Z.[Hua-Zhong],
ST-LDM: A Universal Framework for Text-grounded Object Generation in
Real Images,
ECCV24(XLVI: 145-162).
Springer DOI
2412
BibRef
Wu, Z.F.[Zhi-Fan],
Huang, L.H.[Liang-Hua],
Wang, W.[Wei],
Wei, Y.H.[Yan-Heng],
Liu, Y.[Yu],
MultiGen: Zero-Shot Image Generation from Multi-Modal Prompts,
ECCV24(VIII: 297-313).
Springer DOI
2412
BibRef
Sun, Q.[Qi],
Zhou, H.[Hang],
Zhou, W.G.[Wen-Gang],
Li, L.[Li],
Li, H.Q.[Hou-Qiang],
Forest2Seq: Revitalizing Order Prior for Sequential Indoor Scene
Synthesis,
ECCV24(XXV: 251-268).
Springer DOI
2412
BibRef
Wei, Y.X.[Yu-Xiang],
Ji, Z.L.[Zhi-Long],
Bai, J.F.[Jin-Feng],
Zhang, H.Z.[Hong-Zhi],
Zhang, L.[Lei],
Zuo, W.M.[Wang-Meng],
MasterWeaver: Taming Editability and Face Identity for Personalized
Text-to-image Generation,
ECCV24(LI: 252-271).
Springer DOI
2412
BibRef
Lee, S.H.[Seung Hyun],
Li, Y.X.[Yin-Xiao],
Ke, J.J.[Jun-Jie],
Yoo, I.[Innfarn],
Zhang, H.[Han],
Yu, J.H.[Jia-Hui],
Wang, Q.F.[Qi-Fei],
Deng, F.[Fei],
Entis, G.[Glenn],
He, J.F.[Jun-Feng],
Li, G.[Gang],
Kim, S.[Sangpil],
Essa, I.[Irfan],
Yang, F.[Feng],
Parrot: Pareto-optimal Multi-reward Reinforcement Learning Framework
for Text-to-image Generation,
ECCV24(XXXVIII: 462-478).
Springer DOI
2412
BibRef
Zheng, A.Y.J.[Amber Yi-Jia],
Yeh, R.A.[Raymond A.],
IMMA: Immunizing Text-to-image Models Against Malicious Adaptation,
ECCV24(XXXIX: 458-475).
Springer DOI
2412
BibRef
Chatterjee, A.[Agneet],
Ben Melech-Stan, G.[Gabriela],
Aflalo, E.[Estelle],
Paul, S.[Sayak],
Ghosh, D.[Dhruba],
Gokhale, T.[Tejas],
Schmidt, L.[Ludwig],
Hajishirzi, H.[Hannaneh],
Lal, V.[Vasudev],
Baral, C.[Chitta],
Yang, Y.Z.[Ye-Zhou],
Getting it Right: Improving Spatial Consistency in Text-to-image Models,
ECCV24(XXII: 204-222).
Springer DOI
2412
BibRef
Liu, R.T.[Run-Tao],
Khakzar, A.[Ashkan],
Gu, J.D.[Jin-Dong],
Chen, Q.F.[Qi-Feng],
Torr, P.H.S.[Philip H.S.],
Pizzati, F.[Fabio],
Latent Guard: A Safety Framework for Text-to-image Generation,
ECCV24(XXVI: 93-109).
Springer DOI
2412
BibRef
Wei, F.[Fanyue],
Zeng, W.[Wei],
Li, Z.Y.[Zhen-Yang],
Yin, D.W.[Da-Wei],
Duan, L.X.[Li-Xin],
Li, W.[Wen],
Powerful and Flexible: Personalized Text-to-image Generation via
Reinforcement Learning,
ECCV24(XXVII: 394-410).
Springer DOI
2412
BibRef
Li, H.T.[Han-Ting],
Niu, H.J.[Hong-Jing],
Zhao, F.[Feng],
Stable Preference: Redefining Training Paradigm of Human Preference
Model for Text-to-image Synthesis,
ECCV24(XXVIII: 250-266).
Springer DOI
2412
BibRef
Gal, R.[Rinon],
Lichter, O.[Or],
Richardson, E.[Elad],
Patashnik, O.[Or],
Bermano, A.H.[Amit H.],
Chechik, G.[Gal],
Cohen-Or, D.[Daniel],
LCM-Lookahead for Encoder-based Text-to-image Personalization,
ECCV24(XIV: 322-340).
Springer DOI
2412
BibRef
Dahary, O.[Omer],
Patashnik, O.[Or],
Aberman, K.[Kfir],
Cohen-Or, D.[Daniel],
Be Yourself: Bounded Attention for Multi-subject Text-to-image
Generation,
ECCV24(XIV: 432-448).
Springer DOI
2412
BibRef
Xiong, P.X.[Pei-Xi],
Kozuch, M.[Michael],
Jain, N.[Nilesh],
Textual-visual Logic Challenge:
Understanding and Reasoning in Text-to-image Generation,
ECCV24(V: 318-334).
Springer DOI
2412
BibRef
Sun, Y.[Yanan],
Liu, Y.C.[Yan-Chen],
Tang, Y.H.[Yin-Hao],
Pei, W.J.[Wen-Jie],
Chen, K.[Kai],
AnyControl: Create Your Artwork with Versatile Control on Text-to-image
Generation,
ECCV24(XI: 92-109).
Springer DOI
2412
BibRef
Lu, C.Y.[Chen-Yi],
Agarwal, S.[Shubham],
Tanjim, M.M.[Md Mehrab],
Mahadik, K.[Kanak],
Rao, A.[Anup],
Mitra, S.[Subrata],
Saini, S.K.[Shiv Kumar],
Bagchi, S.[Saurabh],
Chaterji, S.[Somali],
ReCon: Training-free Acceleration for Text-to-image Synthesis with
Retrieval of Concept Prompt Trajectories,
ECCV24(LIX: 288-306).
Springer DOI
2412
BibRef
Zhao, S.H.[Shi-Hao],
Hao, S.[Shaozhe],
Zi, B.[Bojia],
Xu, H.Z.[Huai-Zhe],
Wong, K.Y.K.[Kwan-Yee K.],
Bridging Different Language Models and Generative Vision Models for
Text-to-image Generation,
ECCV24(LXXXI: 70-86).
Springer DOI
2412
BibRef
Tan, Z.Y.[Zhi-Yu],
Yang, M.P.[Meng-Ping],
Qin, L.[Luozheng],
Yang, H.[Hao],
Qian, Y.[Ye],
Zhou, Q.[Qiang],
Zhang, C.[Cheng],
Li, H.[Hao],
An Empirical Study and Analysis of Text-to-image Generation Using Large
Language Model-powered Textual Representation,
ECCV24(LXXX: 472-489).
Springer DOI
2412
BibRef
Chinchure, A.[Aditya],
Shukla, P.[Pushkar],
Bhatt, G.[Gaurav],
Salij, K.[Kiri],
Hosanagar, K.[Kartik],
Sigal, L.[Leonid],
Turk, M.[Matthew],
TIBET: Identifying and Evaluating Biases in Text-to-image Generative
Models,
ECCV24(LXXIX: 429-446).
Springer DOI
2412
BibRef
Mittal, S.[Surbhi],
Sudan, A.[Arnav],
Vatsa, M.[Mayank],
Singh, R.[Richa],
Glaser, T.[Tamar],
Hassner, T.[Tal],
Navigating Text-to-image Generative Bias Across Indic Languages,
ECCV24(LXXXVIII: 53-67).
Springer DOI
2412
BibRef
Chang, Y.S.[Ying-Shan],
Zhang, Y.[Yasi],
Fang, Z.Y.[Zhi-Yuan],
Wu, Y.N.[Ying Nian],
Bisk, Y.[Yonatan],
Gao, F.[Feng],
Skews in the Phenomenon Space Hinder Generalization in Text-to-image
Generation,
ECCV24(LXXXVII: 422-439).
Springer DOI
2412
BibRef
Yang, Y.Q.[Yu-Qing],
Moremada, C.[Charuka],
Deligiannis, N.[Nikos],
On the Detection of Images Generated from Text,
ICIP24(3792-3798)
IEEE DOI
2411
Resistance, Visualization, Computational modeling,
Perturbation methods, Noise, Text to image, Detectors, robustness
BibRef
Liu, Z.X.[Zhi-Xuan],
Schaldenbrand, P.[Peter],
Okogwu, B.C.[Beverley-Claire],
Peng, W.X.[Wen-Xuan],
Yun, Y.[Youngsik],
Hundt, A.[Andrew],
Kim, J.[Jihie],
Oh, J.[Jean],
SCoFT: Self-Contrastive Fine-Tuning for Equitable Image Generation,
CVPR24(10822-10832)
IEEE DOI Code:
WWW Link.
2410
I.e. cultural biases.
Measurement, Surveys, Image synthesis, Generative AI, Media,
Copyright protection, Data models, Image Synthesis,
Computer Vision for Social Good
BibRef
Zhang, Y.X.[Yu-Xuan],
Song, Y.R.[Yi-Ren],
Liu, J.M.[Jia-Ming],
Wang, R.[Rui],
Yu, J.P.[Jin-Peng],
Tang, H.[Hao],
Li, H.X.[Hua-Xia],
Tang, X.[Xu],
Hu, Y.[Yao],
Pan, H.[Han],
Jing, Z.L.[Zhong-Liang],
SSR-Encoder: Encoding Selective Subject Representation for
Subject-Driven Generation,
CVPR24(8069-8078)
IEEE DOI
2410
Code:
WWW Link.
Training, Adaptation models, Image coding, Image synthesis,
Ecosystems, Feature extraction
BibRef
Lee, J.[Jumin],
Lee, S.[Sebin],
Jo, C.[Changho],
Im, W.B.[Woo-Bin],
Seon, J.[Juhyeong],
Yoon, S.E.[Sung-Eui],
SemCity: Semantic Scene Generation with Triplane Diffusion,
CVPR24(28337-28347)
IEEE DOI Code:
WWW Link.
2410
Roads, Computational modeling, Semantics, Diffusion processes,
Diffusion models, diffusion models, scene generation,
semantic generation
BibRef
Raistrick, A.[Alexander],
Mei, L.J.[Ling-Jie],
Kayan, K.[Karhan],
Yan, D.[David],
Zuo, Y.M.[Yi-Ming],
Han, B.[Beining],
Wen, H.Y.[Hong-Yu],
Parakh, M.[Meenal],
Alexandropoulos, S.[Stamatis],
Lipson, L.[Lahav],
Ma, Z.[Zeyu],
Deng, J.[Jia],
Infinigen Indoors: Photorealistic Indoor Scenes using Procedural
Generation,
CVPR24(21783-21794)
IEEE DOI
2410
Training, Procedural generation, Licenses, Real-time systems,
Libraries, Generators, Procedural Generation, Indoor, Dataset,
Robotics
BibRef
Ji, P.L.[Peng-Liang],
Liu, J.C.[Jun-Chen],
TlTScore: Towards Long-Tail Effects in Text-to-Visual Evaluation with
Generative Foundation Models,
GenerativeFM24(5302-5313)
IEEE DOI
2410
Measurement, Accuracy, Semantics, Benchmark testing, Cognition,
text-to-visual evaluation, benchmark evaluation, multimodal,
generative foundation models
BibRef
Zhao, S.Y.[Shi-Yu],
Zhao, L.[Long],
Kumar, B.G.V.[B.G. Vijay],
Suh, Y.M.[Yu-Min],
Metaxas, D.N.[Dimitris N.],
Chandraker, M.[Manmohan],
Schulter, S.[Samuel],
Generating Enhanced Negatives for Training Language-Based Object
Detectors,
CVPR24(13592-13602)
IEEE DOI Code:
WWW Link.
2410
Training, Vocabulary, Accuracy, Training data, Text to image,
Detectors, Benchmark testing, open-vocabulary object detection,
negative example mining
BibRef
Fan, L.J.[Li-Jie],
Chen, K.[Kaifeng],
Krishnan, D.[Dilip],
Katabi, D.[Dina],
Isola, P.[Phillip],
Tian, Y.L.[Yong-Long],
Scaling Laws of Synthetic Images for Model Training ... for Now,
CVPR24(7382-7392)
IEEE DOI
2410
Training, Computational modeling, Machine vision, Text to image,
Training data, Data models
BibRef
Qiao, P.C.[Peng-Chong],
Shang, L.[Lei],
Liu, C.[Chang],
Sun, B.[Baigui],
Ji, X.Y.[Xiang-Yang],
Chen, J.[Jie],
FaceChain-SuDe: Building Derived Class to Inherit Category Attributes
for One-Shot Subject-Driven Generation,
CVPR24(7215-7224)
IEEE DOI
2410
Codes, Object oriented modeling, Semantics, Buildings, Text to image,
subject-driven generation, derived class
BibRef
Zhu, J.Y.[Jia-Yi],
Guo, Q.[Qing],
Juefei-Xu, F.[Felix],
Huang, Y.H.[Yi-Hao],
Liu, Y.[Yang],
Pu, G.[Geguang],
CosalPure: Learning Concept from Group Images for Robust Co-Saliency
Detection,
CVPR24(3669-3678)
IEEE DOI Code:
WWW Link.
2410
Technological innovation, Purification, Perturbation methods,
Noise, Semantics, Text to image, Object detection
BibRef
Chan, K.C.K.[Kelvin C.K.],
Zhao, Y.[Yang],
Jia, X.H.[Xu-Hui],
Yang, M.H.[Ming-Hsuan],
Wang, H.[Huisheng],
Improving Subject-Driven Image Synthesis with Subject-Agnostic
Guidance,
CVPR24(6733-6742)
IEEE DOI
2410
Training, Codes, Image synthesis, Text to image
BibRef
Haji-Ali, M.[Moayed],
Balakrishnan, G.[Guha],
Ordonez, V.[Vicente],
ElasticDiffusion: Training-Free Arbitrary Size Image Generation
Through Global-Local Content Separation,
CVPR24(6603-6612)
IEEE DOI Code:
WWW Link.
2410
Image synthesis, Text to image, Coherence, Diffusion models,
Decoding, Trajectory, Text2Image, Diffusion Models, Image Generation,
Stablediffusion
BibRef
Zhang, C.[Cheng],
Wu, Q.Y.[Qian-Yi],
Gambardella, C.C.[Camilo Cruz],
Huang, X.S.[Xiao-Shui],
Phung, D.[Dinh],
Ouyang, W.L.[Wan-Li],
Cai, J.F.[Jian-Fei],
Taming Stable Diffusion for Text to 360° Panorama Image Generation,
CVPR24(6347-6357)
IEEE DOI
2410
Image synthesis, Layout, Noise reduction,
Diffusion models, Distortion
BibRef
Hu, H.X.[He-Xiang],
Chan, K.C.K.[Kelvin C.K.],
Su, Y.C.[Yu-Chuan],
Chen, W.[Wenhu],
Li, Y.D.[Yan-Dong],
Sohn, K.[Kihyuk],
Zhao, Y.[Yang],
Ben, X.[Xue],
Gong, B.Q.[Bo-Qing],
Cohen, W.[William],
Chang, M.W.[Ming-Wei],
Jia, X.H.[Xu-Hui],
Instruct-Imagen: Image Generation with Multi-modal Instruction,
CVPR24(4754-4763)
IEEE DOI
2410
Training, Adaptation models, Image synthesis, Image edge detection,
Natural languages, Text to image, Diffusion Model,
Generalization to Unseen Tasks
BibRef
Kondapaneni, N.[Neehar],
Marks, M.[Markus],
Knott, M.[Manuel],
Guimaraes, R.[Rogerio],
Perona, P.[Pietro],
Text-Image Alignment for Diffusion-Based Perception,
CVPR24(13883-13893)
IEEE DOI
2410
Visualization, Codes, Semantic segmentation,
Computational modeling, Text to image, Estimation, ADE20K
BibRef
Qiao, R.[Runqi],
Yang, L.[Lan],
Pang, K.Y.[Kai-Yue],
Zhang, H.G.[Hong-Gang],
Making Visual Sense of Oracle Bones for You and Me,
CVPR24(12656-12665)
IEEE DOI Code:
WWW Link.
2410
Training, Heart, Visualization, Semantics, Text to image, Manuals, Bones
BibRef
Shrestha, R.[Robik],
Zou, Y.[Yang],
Chen, Q.Y.[Qiu-Yu],
Li, Z.H.[Zhi-Heng],
Xie, Y.S.[Yu-Sheng],
Deng, S.Q.[Si-Qi],
FairRAG: Fair Human Generation via Fair Retrieval Augmentation,
CVPR24(11996-12005)
IEEE DOI
2410
Visualization, Image synthesis, Image databases, Computational modeling,
training data, Text to image, bias, fairness, generative-ai
BibRef
Jayasumana, S.[Sadeep],
Ramalingam, S.[Srikumar],
Veit, A.[Andreas],
Glasner, D.[Daniel],
Chakrabarti, A.[Ayan],
Kumar, S.[Sanjiv],
Rethinking FID: Towards a Better Evaluation Metric for Image
Generation,
CVPR24(9307-9315)
IEEE DOI Code:
WWW Link.
2410
Measurement, Machine learning algorithms, Image synthesis,
Text to image, Machine learning, Probability distribution, CMMD
BibRef
Wu, Y.[You],
Liu, K.[Kean],
Mi, X.Y.[Xiao-Yue],
Tang, F.[Fan],
Cao, J.[Juan],
Li, J.T.[Jin-Tao],
U-VAP: User-specified Visual Appearance Personalization via Decoupled
Self Augmentation,
CVPR24(9482-9491)
IEEE DOI Code:
WWW Link.
2410
Visualization, Semantics, Refining, Text to image,
Aerospace electronics, Controllability
BibRef
Po, R.[Ryan],
Yang, G.[Guandao],
Aberman, K.[Kfir],
Wetzstein, G.[Gordon],
Orthogonal Adaptation for Modular Customization of Diffusion Models,
CVPR24(7964-7973)
IEEE DOI
2410
Adaptation models, Computational modeling, Scalability, Merging,
Text to image, Interference
BibRef
Bahmani, S.[Sherwin],
Skorokhodov, I.[Ivan],
Rong, V.[Victor],
Wetzstein, G.[Gordon],
Guibas, L.J.[Leonidas J.],
Wonka, P.[Peter],
Tulyakov, S.[Sergey],
Park, J.J.[Jeong Joon],
Tagliasacchi, A.[Andrea],
Lindell, D.B.[David B.],
4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling,
CVPR24(7996-8006)
IEEE DOI
2410
Measurement, Training, Solid modeling, Dynamics, Text to image,
Hybrid power systems
BibRef
Zhang, S.[Sixian],
Wang, B.[Bohan],
Wu, J.Q.[Jun-Qiang],
Li, Y.[Yan],
Gao, T.T.[Ting-Ting],
Zhang, D.[Di],
Wang, Z.Y.[Zhong-Yuan],
Learning Multi-Dimensional Human Preference for Text-to-Image
Generation,
CVPR24(8018-8027)
IEEE DOI
2410
Measurement, Image synthesis, Annotations, Computational modeling,
Semantics, Text to image, Text-to-image generation, Evaluation
BibRef
Zhang, Y.M.[Yi-Ming],
Xing, Z.[Zhening],
Zeng, Y.H.[Yan-Hong],
Fang, Y.Q.[You-Qing],
Chen, K.[Kai],
PIA: Your Personalized Image Animator via Plug-and-Play Modules in
Text-to-Image Models,
CVPR24(7747-7756)
IEEE DOI
2410
Text to image, Benchmark testing, Animation, Controllability,
Tuning
BibRef
Huang, S.[Siteng],
Gong, B.[Biao],
Feng, Y.T.[Yu-Tong],
Chen, X.[Xi],
Fu, Y.Q.[Yu-Qian],
Liu, Y.[Yu],
Wang, D.L.[Dong-Lin],
Learning Disentangled Identifiers for Action-Customized Text-to-Image
Generation,
CVPR24(7797-7806)
IEEE DOI Code:
WWW Link.
2410
Animals, Semantics, Text to image, Feature extraction,
Contamination, text-to-image generation,
Action-Disentangled Identifier
BibRef
Chen, Z.J.[Zi-Jie],
Zhang, L.C.[Li-Chao],
Weng, F.S.[Fang-Sheng],
Pan, L.[Lili],
Lan, Z.Z.[Zhen-Zhong],
Tailored Visions: Enhancing Text-to-Image Generation with
Personalized Prompt Rewriting,
CVPR24(7727-7736)
IEEE DOI Code:
WWW Link.
2410
Visualization, Codes, Text to image
BibRef
Qu, L.G.[Lei-Gang],
Wang, W.J.[Wen-Jie],
Li, Y.Q.[Yong-Qi],
Zhang, H.W.[Han-Wang],
Nie, L.Q.[Li-Qiang],
Chua, T.S.[Tat-Seng],
Discriminative Probing and Tuning for Text-to-Image Generation,
CVPR24(7434-7444)
IEEE DOI Code:
WWW Link.
2410
Adaptation models, Large language models, Face recognition,
Computational modeling, Layout, Text to image
BibRef
Ruiz, N.[Nataniel],
Li, Y.Z.[Yuan-Zhen],
Jampani, V.[Varun],
Wei, W.[Wei],
Hou, T.B.[Ting-Bo],
Pritch, Y.[Yael],
Wadhwa, N.[Neal],
Rubinstein, M.[Michael],
Aberman, K.[Kfir],
HyperDreamBooth: HyperNetworks for Fast Personalization of
Text-to-Image Models,
CVPR24(6527-6536)
IEEE DOI
2410
Generative AI, Face recognition, Semantics, Memory management,
Text to image, Graphics processing units, diffusion models,
subject driven personalization
BibRef
Zhang, Y.B.[Yan-Bing],
Yang, M.P.[Meng-Ping],
Zhou, Q.[Qin],
Wang, Z.[Zhe],
Attention Calibration for Disentangled Text-to-Image Personalization,
CVPR24(4764-4774)
IEEE DOI
2410
Visualization, Solid modeling, Image synthesis, Pipelines,
Text to image, Text-to-image, Personalization, Attention Calibration
BibRef
Dao, T.T.[Trung Tuan],
Vu, D.H.[Duc Hong],
Pham, C.[Cuong],
Tran, A.[Anh],
EFHQ: Multi-Purpose ExtremePose-Face-HQ Dataset,
CVPR24(22605-22615)
IEEE DOI
2410
Training, Deep learning, Face recognition, Pipelines, Text to image,
Benchmark testing
BibRef
Cazenavette, G.[George],
Sud, A.[Avneesh],
Leung, T.[Thomas],
Usman, B.[Ben],
FakeInversion: Learning to Detect Images from Unseen Text-to-Image
Models by Inverting Stable Diffusion,
CVPR24(10759-10769)
IEEE DOI
2410
Training, Visualization, Protocols, Text to image, Detectors,
Benchmark testing, Feature extraction, diffusion, fake detection
BibRef
Jayasumana, S.[Sadeep],
Glasner, D.[Daniel],
Ramalingam, S.[Srikumar],
Veit, A.[Andreas],
Chakrabarti, A.[Ayan],
Kumar, S.[Sanjiv],
MarkovGen: Structured Prediction for Efficient Text-to-Image
Generation,
CVPR24(9316-9325)
IEEE DOI
2410
Training, Image quality, Adaptation models, Image synthesis,
Computational modeling, Text to image, Predictive models, Image generation
BibRef
Shi, J.[Jing],
Xiong, W.[Wei],
Lin, Z.[Zhe],
Jung, H.J.[Hyun Joon],
InstantBooth: Personalized Text-to-Image Generation without Test-Time
Finetuning,
CVPR24(8543-8552)
IEEE DOI Code:
WWW Link.
2410
Image quality, Adaptation models, Technological innovation,
Image synthesis, Scalability, Text to image, image generation
BibRef
Liang, Y.[Youwei],
He, J.F.[Jun-Feng],
Li, G.[Gang],
Li, P.Z.[Pei-Zhao],
Klimovskiy, A.[Arseniy],
Carolan, N.[Nicholas],
Sun, J.[Jiao],
Pont-Tuset, J.[Jordi],
Young, S.[Sarah],
Yang, F.[Feng],
Ke, J.J.[Jun-Jie],
Dvijotham, K.D.[Krishnamurthy Dj],
Collins, K.M.[Katherine M.],
Luo, Y.W.[Yi-Wen],
Li, Y.[Yang],
Kohlhoff, K.J.[Kai J.],
Ramachandran, D.[Deepak],
Navalpakkam, V.[Vidhya],
Rich Human Feedback for Text-to-Image Generation,
CVPR24(19401-19411)
IEEE DOI Code:
WWW Link.
2410
Image synthesis, Large language models, Text to image,
Training data, Reinforcement learning, Predictive models,
rich human feedback
BibRef
Li, X.[Xiang],
Shen, Q.L.[Qian-Li],
Kawaguchi, K.[Kenji],
VA3: Virtually Assured Amplification Attack on Probabilistic
Copyright Protection for Text-to-Image Generative Models,
CVPR24(12363-12373)
IEEE DOI Code:
WWW Link.
2410
Codes, Text to image, Closed box, Copyright protection,
Probabilistic logic, copyright protection,
text-to-image
BibRef
d'Incà, M.[Moreno],
Peruzzo, E.[Elia],
Mancini, M.[Massimiliano],
Xu, D.[Dejia],
Goel, V.[Vidit],
Xu, X.Q.[Xing-Qian],
Wang, Z.Y.[Zhang-Yang],
Shi, H.[Humphrey],
Sebe, N.[Nicu],
OpenBias: Open-Set Bias Detection in Text-to-Image Generative Models,
CVPR24(12225-12235)
IEEE DOI
2410
Limiting, Prevention and mitigation, Large language models,
Pipelines, Knowledge based systems, Text to image, Generative AI,
Text-to-Image
BibRef
Le Coz, A.[Adrien],
Ouertatani, H.[Houssem],
Herbin, S.[Stéphane],
Adjed, F.[Faouzi],
Efficient Exploration of Image Classifier Failures with Bayesian
Optimization and Text-to-Image Models,
GCV24(7569-7578)
IEEE DOI
2410
Training, Costs, Image synthesis, Computational modeling,
Text to image, Benchmark testing, image classifier failures,
bayesian optimization
BibRef
Wang, Y.L.[Yi-Lin],
Xu, H.Y.[Hai-Yang],
Zhang, X.[Xiang],
Chen, Z.Y.[Ze-Yuan],
Sha, Z.Z.[Zhi-Zhou],
Wang, Z.[Zirui],
Tu, Z.W.[Zhuo-Wen],
OmniControlNet: Dual-stage Integration for Conditional Image
Generation,
GCV24(7436-7448)
IEEE DOI
2410
Image synthesis, Image edge detection, Redundancy, Pipelines,
Text to image, Process control, Predictive models, Generative Models
BibRef
Ganz, R.[Roy],
Elad, M.[Michael],
CLIPAG: Towards Generator-Free Text-to-Image Generation,
WACV24(3831-3841)
IEEE DOI
2404
Computational modeling, Semantics,
Generators, Task analysis, Image classification, Algorithms,
Vision + language and/or other modalities
BibRef
Park, S.[Seongbeom],
Moon, S.H.[Su-Hong],
Park, S.H.[Seung-Hyun],
Kim, J.[Jinkyu],
Localization and Manipulation of Immoral Visual Cues for Safe
Text-to-Image Generation,
WACV24(4663-4672)
IEEE DOI
2404
Location awareness, Ethics, Visualization, Analytical models,
Image recognition, Computational modeling, Algorithms, Explainable,
Vision + language and/or other modalities
BibRef
Jeanneret, G.[Guillaume],
Simon, L.[Loïc],
Jurie, F.[Frédéric],
Text-to-Image Models for Counterfactual Explanations:
A Black-Box Approach,
WACV24(4745-4755)
IEEE DOI
2404
Analytical models, Codes, Computational modeling, Closed box,
Algorithms, Explainable, fair, accountable,
Vision + language and/or other modalities
BibRef
Grimal, P.[Paul],
Le Borgne, H.[Hervé],
Ferret, O.[Olivier],
Tourille, J.[Julien],
TIAM - A Metric for Evaluating Alignment in Text-to-Image Generation,
WACV24(2878-2887)
IEEE DOI
2404
Measurement, Image quality, Image color analysis,
Rendering (computer graphics), Colored noise, Algorithms,
Vision + language and/or other modalities
BibRef
Qin, C.[Can],
Yu, N.[Ning],
Xing, C.[Chen],
Zhang, S.[Shu],
Chen, Z.Y.[Ze-Yuan],
Ermon, S.[Stefano],
Fu, Y.[Yun],
Xiong, C.M.[Cai-Ming],
Xu, R.[Ran],
GlueGen: Plug and Play Multi-Modal Encoders for X-to-Image Generation,
ICCV23(23028-23039)
IEEE DOI
2401
BibRef
Lee, T.[Taegyeong],
Kang, J.[Jeonghun],
Kim, H.[Hyeonyu],
Kim, T.[Taehwan],
Generating Realistic Images from In-the-wild Sounds,
ICCV23(7126-7136)
IEEE DOI
2401
BibRef
Ye-Bin, M.[Moon],
Kim, J.[Jisoo],
Kim, H.Y.[Hong-Yeob],
Son, K.[Kilho],
Oh, T.H.[Tae-Hyun],
TextManiA: Enriching Visual Feature by Text-driven Manifold Augmentation,
ICCV23(2526-2537)
IEEE DOI
2401
BibRef
Wu, X.S.[Xiao-Shi],
Sun, K.Q.[Ke-Qiang],
Zhu, F.[Feng],
Zhao, R.[Rui],
Li, H.S.[Hong-Sheng],
Human Preference Score: Better Aligning Text-to-image Models with
Human Preference,
ICCV23(2096-2105)
IEEE DOI Code:
WWW Link.
2401
BibRef
Le, T.V.[Thanh Van],
Phung, H.[Hao],
Nguyen, T.H.[Thuan Hoang],
Dao, Q.[Quan],
Tran, N.N.[Ngoc N.],
Tran, A.[Anh],
Anti-DreamBooth: Protecting users from personalized text-to-image
synthesis,
ICCV23(2116-2127)
IEEE DOI Code:
WWW Link.
2401
BibRef
Agarwal, A.[Aishwarya],
Karanam, S.[Srikrishna],
Joseph, K.J.,
Saxena, A.[Apoorv],
Goswami, K.[Koustava],
Srinivasan, B.V.[Balaji Vasan],
A-STAR: Test-time Attention Segregation and Retention for
Text-to-image Synthesis,
ICCV23(2283-2293)
IEEE DOI
2401
BibRef
Cho, J.[Jaemin],
Zala, A.[Abhay],
Bansal, M.[Mohit],
DALL-EVAL: Probing the Reasoning Skills and Social Biases of
Text-to-Image Generation Models,
ICCV23(3020-3031)
IEEE DOI
2401
BibRef
Zhang, C.[Cheng],
Chen, X.[Xuanbai],
Chai, S.Q.[Si-Qi],
Wu, C.H.[Chen Henry],
Lagun, D.[Dmitry],
Beeler, T.[Thabo],
de la Torre, F.[Fernando],
ITI-Gen: Inclusive Text-to-Image Generation,
ICCV23(3946-3957)
IEEE DOI
2401
BibRef
Struppek, L.[Lukas],
Hintersdorf, D.[Dominik],
Kersting, K.[Kristian],
Rickrolling the Artist: Injecting Backdoors into Text Encoders for
Text-to-Image Synthesis,
ICCV23(4561-4573)
IEEE DOI Code:
WWW Link.
2401
BibRef
Basu, A.[Abhipsa],
Babu, R.V.[R. Venkatesh],
Pruthi, D.[Danish],
Inspecting the Geographical Representativeness of Images from
Text-to-Image Models,
ICCV23(5113-5124)
IEEE DOI
2401
BibRef
Wang, S.Y.[Sheng-Yu],
Efros, A.A.[Alexei A.],
Zhu, J.Y.[Jun-Yan],
Zhang, R.[Richard],
Evaluating Data Attribution for Text-to-Image Models,
ICCV23(7158-7169)
IEEE DOI
2401
BibRef
Wei, Y.X.[Yu-Xiang],
Zhang, Y.[Yabo],
Ji, Z.L.[Zhi-Long],
Bai, J.F.[Jin-Feng],
Zhang, L.[Lei],
Zuo, W.M.[Wang-Meng],
ELITE: Encoding Visual Concepts into Textual Embeddings for
Customized Text-to-Image Generation,
ICCV23(15897-15907)
IEEE DOI Code:
WWW Link.
2401
BibRef
Bakr, E.M.[Eslam Mohamed],
Sun, P.Z.[Peng-Zhan],
Shen, X.Q.[Xiao-Qian],
Khan, F.F.[Faizan Farooq],
Li, L.E.[Li Erran],
Elhoseiny, M.[Mohamed],
HRS-Bench: Holistic, Reliable and Scalable Benchmark for
Text-to-Image Models,
ICCV23(19984-19996)
IEEE DOI Code:
WWW Link.
2401
BibRef
Lee, J.[Jaewoong],
Jang, S.[Sangwon],
Jo, J.[Jaehyeong],
Yoon, J.[Jaehong],
Kim, Y.J.[Yun-Ji],
Kim, J.H.[Jin-Hwa],
Ha, J.W.[Jung-Woo],
Hwang, S.J.[Sung Ju],
Text-Conditioned Sampling Framework for Text-to-Image Generation with
Masked Generative Models,
ICCV23(23195-23205)
IEEE DOI
2401
BibRef
Zhang, Z.Q.[Zhi-Qiang],
Xu, J.Y.[Jia-Yao],
Morita, R.[Ryugo],
Yu, W.X.[Wen-Xin],
Zhou, J.J.[Jin-Jia],
Dynamic Unilateral Dual Learning for Text to Image Synthesis,
ICIP23(1130-1134)
IEEE DOI
2312
BibRef
Mao, J.F.[Jia-Feng],
Wang, X.T.[Xue-Ting],
Training-Free Location-Aware Text-to-Image Synthesis,
ICIP23(995-999)
IEEE DOI
2312
BibRef
Morita, R.[Ryugo],
Zhang, Z.Q.[Zhi-Qiang],
Zhou, J.J.[Jin-Jia],
BATINeT: Background-Aware Text to Image Synthesis and Manipulation
Network,
ICIP23(765-769)
IEEE DOI
2312
BibRef
Yang, S.S.[Shu-Sheng],
Ge, Y.X.[Yi-Xiao],
Yi, K.[Kun],
Li, D.[Dian],
Shan, Y.[Ying],
Qie, X.H.[Xiao-Hu],
Wang, X.G.[Xing-Gang],
RILS: Masked Visual Reconstruction in Language Semantic Space,
CVPR23(23304-23314)
IEEE DOI
2309
BibRef
Zeng, Y.[Yu],
Lin, Z.[Zhe],
Zhang, J.M.[Jian-Ming],
Liu, Q.[Qing],
Collomosse, J.[John],
Kuen, J.[Jason],
Patel, V.M.[Vishal M.],
SceneComposer: Any-Level Semantic Image Synthesis,
CVPR23(22468-22478)
IEEE DOI
2309
BibRef
Lin, J.[Junfan],
Chang, J.L.[Jian-Long],
Liu, L.B.[Ling-Bo],
Li, G.B.[Guan-Bin],
Lin, L.[Liang],
Tian, Q.[Qi],
Chen, C.W.[Chang Wen],
Being Comes from Not-Being: Open-Vocabulary Text-to-Motion Generation
with Wordless Training,
CVPR23(23222-23231)
IEEE DOI
2309
BibRef
Yang, Z.Y.[Zheng-Yuan],
Wang, J.F.[Jian-Feng],
Gan, Z.[Zhe],
Li, L.J.[Lin-Jie],
Lin, K.[Kevin],
Wu, C.[Chenfei],
Duan, N.[Nan],
Liu, Z.C.[Zi-Cheng],
Liu, C.[Ce],
Zeng, M.[Michael],
Wang, L.J.[Li-Juan],
ReCo: Region-Controlled Text-to-Image Generation,
CVPR23(14246-14255)
IEEE DOI
2309
BibRef
Otani, M.[Mayu],
Togashi, R.[Riku],
Sawai, Y.[Yu],
Ishigami, R.[Ryosuke],
Nakashima, Y.[Yuta],
Rahtu, E.[Esa],
Heikkilä, J.[Janne],
Satoh, S.[Shin'ichi],
Toward Verifiable and Reproducible Human Evaluation for Text-to-Image
Generation,
CVPR23(14277-14286)
IEEE DOI
2309
BibRef
Kang, M.[Minguk],
Zhu, J.Y.[Jun-Yan],
Zhang, R.[Richard],
Park, J.[Jaesik],
Shechtman, E.[Eli],
Paris, S.[Sylvain],
Park, T.[Taesung],
Scaling up GANs for Text-to-Image Synthesis,
CVPR23(10124-10134)
IEEE DOI
2309
BibRef
Careil, M.[Marlène],
Verbeek, J.[Jakob],
Lathuilière, S.[Stéphane],
Few-shot Semantic Image Synthesis with Class Affinity Transfer,
CVPR23(23611-23620)
IEEE DOI
2309
BibRef
Kang, M.S.[Min-Soo],
Lee, D.[Doyup],
Kim, J.[Jiseob],
Kim, S.[Saehoon],
Han, B.H.[Bo-Hyung],
Variational Distribution Learning for Unsupervised Text-to-Image
Generation,
CVPR23(23380-23389)
IEEE DOI
2309
BibRef
Sung-Bin, K.[Kim],
Senocak, A.[Arda],
Ha, H.W.[Hyun-Woo],
Owens, A.[Andrew],
Oh, T.H.[Tae-Hyun],
Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment,
CVPR23(6430-6440)
IEEE DOI
2309
BibRef
Akula, A.R.[Arjun R.],
Driscoll, B.[Brendan],
Narayana, P.[Pradyumna],
Changpinyo, S.[Soravit],
Jia, Z.W.[Zhi-Wei],
Damle, S.[Suyash],
Pruthi, G.[Garima],
Basu, S.[Sugato],
Guibas, L.J.[Leonidas J.],
Freeman, W.T.[William T.],
Li, Y.Z.[Yuan-Zhen],
Jampani, V.[Varun],
MetaCLUE: Towards Comprehensive Visual Metaphors Research,
CVPR23(23201-23211)
IEEE DOI
2309
BibRef
Hwang, I.[Inwoo],
Kim, H.[Hyeonwoo],
Kim, Y.M.[Young Min],
Text2Scene: Text-driven Indoor Scene Stylization with Part-Aware
Details,
CVPR23(1890-1899)
IEEE DOI
2309
BibRef
Li, Y.H.[Yu-Heng],
Liu, H.T.[Hao-Tian],
Wu, Q.Y.[Qing-Yang],
Mu, F.Z.[Fang-Zhou],
Yang, J.W.[Jian-Wei],
Gao, J.F.[Jian-Feng],
Li, C.Y.[Chun-Yuan],
Lee, Y.J.[Yong Jae],
GLIGEN: Open-Set Grounded Text-to-Image Generation,
CVPR23(22511-22521)
IEEE DOI
2309
BibRef
Liang, M.L.[Ming-Liang],
Liu, Z.R.[Zhuo-Ran],
Larson, M.[Martha],
Textual Concept Expansion with Commonsense Knowledge to Improve
Dual-Stream Image-Text Matching,
MMMod23(I: 421-433).
Springer DOI
2304
Text as input, output concepts
BibRef
Zhou, L.L.[Long-Long],
Wu, X.J.[Xiao-Jun],
Xu, T.Y.[Tian-Yang],
COMIM-GAN: Improved Text-to-Image Generation via Condition Optimization
and Mutual Information Maximization,
MMMod23(I: 385-396).
Springer DOI
2304
BibRef
Kim, J.Y.[Jih-Yun],
Jeong, S.H.[Seong-Hun],
Kong, K.[Kyeongbo],
Kang, S.J.[Suk-Ju],
An Unified Framework for Language Guided Image Completion,
WACV23(2567-2577)
IEEE DOI
2302
Training, Visualization, Image synthesis, Computational modeling,
Natural languages, Complexity theory,
Vision + language and/or other modalities
BibRef
Li, H.[Hui],
Yuan, X.C.[Xu-Chang],
Image Generation Method of Bird Text Based on Improved StackGAN,
ICIVC22(805-811)
IEEE DOI
2301
Training, Image synthesis, Convolution, Computational modeling, Semantics,
Birds, Cultural differences, Text to image, StackGAN, Residual structure
BibRef
Li, B.[Bowen],
Word-Level Fine-Grained Story Visualization,
ECCV22(XXXVI:347-362).
Springer DOI
2211
BibRef
Tan, R.[Reuben],
Plummer, B.A.[Bryan A.],
Saenko, K.[Kate],
Lewis, J.P.,
Sud, A.[Avneesh],
Leung, T.[Thomas],
NewsStories: Illustrating Articles with Visual Summaries,
ECCV22(XXXVI:644-661).
Springer DOI
2211
BibRef
Roy, P.[Prasun],
Ghosh, S.[Subhankar],
Bhattacharya, S.[Saumik],
Pal, U.[Umapada],
Blumenstein, M.[Michael],
TIPS: Text-Induced Pose Synthesis,
ECCV22(XXXVIII:161-178).
Springer DOI
2211
BibRef
Yan, K.[Kun],
Ji, L.[Lei],
Wu, C.F.[Chen-Fei],
Bao, J.M.[Jian-Min],
Zhou, M.[Ming],
Duan, N.[Nan],
Ma, S.[Shuai],
Trace Controlled Text to Image Generation,
ECCV22(XXXVI:59-75).
Springer DOI
2211
BibRef
Dinh, T.M.[Tan M.],
Nguyen, R.[Rang],
Hua, B.S.[Binh-Son],
TISE: Bag of Metrics for Text-to-Image Synthesis Evaluation,
ECCV22(XXXVI:594-609).
Springer DOI
2211
BibRef
Zhang, J.H.[Jia-Hui],
Zhan, F.N.[Fang-Neng],
Theobalt, C.[Christian],
Lu, S.J.[Shi-Jian],
Regularized Vector Quantization for Tokenized Image Synthesis,
CVPR23(18467-18476)
IEEE DOI
2309
BibRef
Shim, S.H.[Sang-Heon],
Hyun, S.[Sangeek],
Bae, D.H.[Dae-Hyun],
Heo, J.P.[Jae-Pil],
Local Attention Pyramid for Scene Image Generation,
CVPR22(7764-7772)
IEEE DOI
2210
Measurement, Deep learning, Visualization, Image segmentation,
Image analysis, Image synthesis,
Scene analysis and understanding
BibRef
Sanghi, A.[Aditya],
Chu, H.[Hang],
Lambourne, J.G.[Joseph G.],
Wang, Y.[Ye],
Cheng, C.Y.[Chin-Yi],
Fumero, M.[Marco],
Malekshan, K.R.[Kamal Rahimi],
CLIP-Forge: Towards Zero-Shot Text-to-Shape Generation,
CVPR22(18582-18592)
IEEE DOI
2210
Training, Point cloud compression, Shape, Semantics,
Natural languages, Vision + graphics,
Vision+language
BibRef
Jain, A.[Ajay],
Mildenhall, B.[Ben],
Barron, J.T.[Jonathan T.],
Abbeel, P.[Pieter],
Poole, B.[Ben],
Zero-Shot Text-Guided Object Generation with Dream Fields,
CVPR22(857-866)
IEEE DOI
2210
Geometry, Visualization, Solid modeling, Image color analysis, Shape,
Deep learning architectures and techniques,
Vision applications and systems
BibRef
Bazazian, D.[Dena],
Calway, A.[Andrew],
Damen, D.[Dima],
Dual-Domain Image Synthesis using Segmentation-Guided GAN,
NTIRE22(506-515)
IEEE DOI
2210
Hair, Training, Image segmentation, Codes, Semantics, Nose, Mouth
BibRef
Qi, Y.G.[Yong-Gang],
Su, G.Y.[Guo-Yao],
Chowdhury, P.N.[Pinaki Nath],
Li, M.K.[Ming-Kang],
Song, Y.Z.[Yi-Zhe],
SketchLattice: Latticed Representation for Sketch Manipulation,
ICCV21(933-941)
IEEE DOI
2203
Image quality, Limiting, Computational modeling, Lattices,
Task analysis, Vision + other modalities,
Vision applications and systems
BibRef
Yang, L.[Lan],
Pang, K.Y.[Kai-Yue],
Zhang, H.G.[Hong-Gang],
Song, Y.Z.[Yi-Zhe],
SketchAA: Abstract Representation for Abstract Sketches,
ICCV21(10077-10086)
IEEE DOI
2203
Visualization, Image recognition, Codes, Computational modeling,
Image retrieval, Rendering (computer graphics),
Vision applications and systems
BibRef
Yuan, S.Z.[Shao-Zu],
Dai, A.[Aijun],
Yan, Z.L.[Zhi-Ling],
Guo, Z.[Zehua],
Liu, R.X.[Rui-Xue],
Chen, M.[Meng],
SketchBird: Learning to Generate Bird Sketches from Text,
SHE21(2443-2452)
IEEE DOI
2112
Fuses, Shape, Error analysis, Image edge detection,
Computational modeling
BibRef
Lu, X.P.[Xiao-Peng],
Ng, L.[Lynnette],
Fernandez, J.[Jared],
Zhu, H.[Hao],
CIGLI: Conditional Image Generation from Language & Image,
CLVL21(3127-3131)
IEEE DOI
2112
Codes, Image synthesis, Computational modeling,
Semantics, Cognition
BibRef
Long, J.[Jia],
Lu, H.T.[Hong-Tao],
Multi-level Gate Feature Aggregation with Spatially Adaptive
Batch-instance Normalization for Semantic Image Synthesis,
MMMod21(I:378-390).
Springer DOI
2106
BibRef
Yan, J.W.[Jia-Wei],
Lin, C.S.[Ci-Siang],
Yang, F.E.[Fu-En],
Li, Y.J.[Yu-Jhe],
Wang, Y.C.F.[Yu-Chiang Frank],
Semantics-Guided Representation Learning with Applications to Visual
Synthesis,
ICPR21(7181-7187)
IEEE DOI
2105
Visualization, Interpolation,
Computational modeling, Semantics, Data visualization, Semantic interpolation
BibRef
Tang, S.C.[Shi-Chang],
Zhou, X.[Xu],
He, X.M.[Xu-Ming],
Ma, Y.[Yi],
Disentangled Representation Learning for Controllable Image
Synthesis: An Information-Theoretic Perspective,
ICPR21(10042-10049)
IEEE DOI
2105
Training, Image synthesis, Image color analysis,
Mutual information
BibRef
Ji, Z.Y.[Zhong-Yi],
Wang, W.M.[Wen-Min],
Chen, B.Y.[Bao-Yang],
Han, X.[Xiao],
Text-to-Image Generation via Semi-Supervised Training,
VCIP20(265-268)
IEEE DOI
2102
image classification, learning (artificial intelligence),
text analysis, visual databases, text-to-image generation,
Pseudo Feature
BibRef
Devaranjan, J.[Jeevan],
Kar, A.[Amlan],
Fidler, S.[Sanja],
Meta-SIM2:
Unsupervised Learning of Scene Structure for Synthetic Data Generation,
ECCV20(XVII:715-733).
Springer DOI
2011
WWW Link.
BibRef
Song, Y.Z.[Yun-Zhu],
Tam, Z.R.[Zhi Rui],
Chen, H.J.[Hung-Jen],
Lu, H.H.[Huiao-Han],
Shuai, H.H.[Hong-Han],
Character-preserving Coherent Story Visualization,
ECCV20(XVII:18-33).
Springer DOI
2011
BibRef
Achituve, I.[Idan],
Maron, H.[Haggai],
Chechik, G.[Gal],
Self-Supervised Learning for Domain Adaptation on Point Clouds,
WACV21(123-133)
IEEE DOI
2106
Phase change materials, Training,
Task analysis
BibRef
Rafique, M.U.[Muhammad Usman],
Zhang, Y.[Yu],
Brodie, B.[Benjamin],
Jacobs, N.[Nathan],
Unifying Guided and Unguided Outdoor Image Synthesis,
NTIRE21(776-785)
IEEE DOI
2109
Training, Image synthesis, Impedance matching, Layout,
Benchmark testing, Probabilistic logic
BibRef
Herzig, R.[Roei],
Bar, A.[Amir],
Xu, H.J.[Hui-Juan],
Chechik, G.[Gal],
Darrell, T.J.[Trevor J.],
Globerson, A.[Amir],
Learning Canonical Representations for Scene Graph to Image Generation,
ECCV20(XXVI:210-227).
Springer DOI
2011
BibRef
Zheng, H.T.[Hai-Tian],
Liao, H.[Haofu],
Chen, L.[Lele],
Xiong, W.[Wei],
Chen, T.L.[Tian-Lang],
Luo, J.B.[Jie-Bo],
Example-guided Image Synthesis Using Masked Spatial-channel Attention
and Self-supervision,
ECCV20(XIV:422-439).
Springer DOI
2011
BibRef
Vo, D.M.[Duc Minh],
Sugimoto, A.[Akihiro],
Visual-relation Conscious Image Generation from Structured-text,
ECCV20(XXVIII:290-306).
Springer DOI
2011
BibRef
Burns, A.[Andrea],
Kim, D.H.[Dong-Hyun],
Wijaya, D.[Derry],
Saenko, K.[Kate],
Plummer, B.A.[Bryan A.],
Learning to Scale Multilingual Representations for Vision-Language
Tasks,
ECCV20(IV:197-213).
Springer DOI
2011
BibRef
Huang, H.P.[Hsin-Ping],
Tseng, H.Y.[Hung-Yu],
Lee, H.Y.[Hsin-Ying],
Huang, J.B.[Jia-Bin],
Semantic View Synthesis,
ECCV20(XII: 592-608).
Springer DOI
2010
BibRef
Zhu, Z.[Zhen],
Xu, Z.L.[Zhi-Liang],
You, A.S.[An-Sheng],
Bai, X.[Xiang],
Semantically Multi-Modal Image Synthesis,
CVPR20(5466-5475)
IEEE DOI
2008
Semantics, Task analysis, Convolutional codes, Image generation,
Decoding, Generators, Controllability
BibRef
Liu, C.,
Mao, Z.,
Zhang, T.,
Xie, H.,
Wang, B.,
Zhang, Y.,
Graph Structured Network for Image-Text Matching,
CVPR20(10918-10927)
IEEE DOI
2008
Visualization, Dogs, Semantics, Sparse matrices,
Image edge detection, Learning systems, Feature extraction
BibRef
Yin, G.J.[Guo-Jun],
Liu, B.[Bin],
Sheng, L.[Lu],
Yu, N.H.[Neng-Hai],
Wang, X.G.[Xiao-Gang],
Shao, J.[Jing],
Semantics Disentangling for Text-To-Image Generation,
CVPR19(2322-2331).
IEEE DOI
2002
BibRef
Joseph, K.J.,
Pal, A.[Arghya],
Rajanala, S.[Sailaja],
Balasubramanian, V.N.[Vineeth N.],
C4Synth: Cross-Caption Cycle-Consistent Text-to-Image Synthesis,
WACV19(358-366)
IEEE DOI
1904
image capture, image processing, virtual reality, visual databases,
image editing, plausible image,
Data models
BibRef
Qi, X.,
Chen, Q.,
Jia, J.Y.[Jia-Ya],
Koltun, V.,
Semi-Parametric Image Synthesis,
CVPR18(8808-8816)
IEEE DOI
1812
Image segmentation, Semantics, Layout, Training, Image generation,
Image color analysis, Pipelines
BibRef
Sah, S.,
Peri, D.,
Shringi, A.,
Zhang, C.,
Dominguez, M.,
Savakis, A.,
Ptucha, R.,
Semantically Invariant Text-to-Image Generation,
ICIP18(3783-3787)
IEEE DOI
1809
Measurement, Image generation, Generators, Image quality, Detectors,
Visualization, Cost function
BibRef