11.14.3.7.1 Text to Video Synthesis, Text to Motion

Video Editing. Video Synthesis. Text to Video. Text to Motion. Editing.
See also Text to Image, Layout to Image, Image Based Rendering.
See also Video Diffusion, Video Synthesis.

Zhang, J.[Ji], Mei, K.Z.[Kui-Zhi], Zheng, Y.[Yu], Fan, J.P.[Jian-Ping],
Exploiting Mid-Level Semantics for Large-Scale Complex Video Classification,
MultMed(21), No. 10, October 2019, pp. 2518-2530.
IEEE DOI 1910
feature extraction, image classification, image motion analysis, image representation, large-scale video classification BibRef

Zhang, J.[Ji], Mei, K.Z.[Kui-Zhi], Wang, X., Zheng, Y.[Yu], Fan, J.P.[Jian-Ping],
From Text to Video: Exploiting Mid-Level Semantics for Large-Scale Video Classification,
ICPR18(1695-1700)
IEEE DOI 1812
Semantics, Task analysis, Visualization, Streaming media, Detectors, Encoding, Bridges BibRef

Sener, F.[Fadime], Saraf, R.[Rishabh], Yao, A.[Angela],
Transferring Knowledge From Text to Video: Zero-Shot Anticipation for Procedural Actions,
PAMI(45), No. 6, June 2023, pp. 7836-7852.
IEEE DOI 2305
Visualization, Robots, Data models, Task analysis, Predictive models, Natural languages, Text recognition, Deep learning, video analysis BibRef

Fang, S.[Sheng], Dang, T.T.[Tian-Tian], Wang, S.H.[Shu-Hui], Huang, Q.M.[Qing-Ming],
Linguistic Hallucination for Text-Based Video Retrieval,
CirSysVideo(34), No. 10, October 2024, pp. 9692-9705.
IEEE DOI Code:
WWW Link. 2411
Linguistics, Training, Testing, Encoding, Context modeling, Feature extraction, Task analysis, Text-video retrieval, curriculum learning BibRef


Wu, W.J.[Wei-Jia], Li, Z.[Zhuang], Gu, Y.C.[Yu-Chao], Zhao, R.[Rui], He, Y.F.[Ye-Fei], Zhang, D.J.H.[David Jun-Hao], Shou, M.Z.[Mike Zheng], Li, Y.[Yan], Gao, T.T.[Ting-Ting], Zhang, D.[Di],
DragAnything: Motion Control for Anything Using Entity Representation,
ECCV24(XXII: 331-348).
Springer DOI 2412
Project:
WWW Link. in controllable video generation. BibRef

Chen, X.[Xi], Liu, Z.H.[Zhi-Heng], Chen, M.T.[Meng-Ting], Feng, Y.T.[Yu-Tong], Liu, Y.[Yu], Shen, Y.J.[Yu-Jun], Zhao, H.S.[Heng-Shuang],
LivePhoto: Real Image Animation with Text-guided Motion Control,
ECCV24(XVIII: 475-491).
Springer DOI 2412
Project:
WWW Link. BibRef

Dai, W.X.[Wen-Xun], Chen, L.H.[Ling-Hao], Wang, J.B.[Jing-Bo], Liu, J.P.[Jin-Peng], Dai, B.[Bo], Tang, Y.S.[Yan-Song],
MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model,
ECCV24(XVI: 390-408).
Springer DOI 2412
BibRef

Huang, Y.M.[Yi-Ming], Wan, W.L.[Wei-Lin], Yang, Y.[Yue], Callison-Burch, C.[Chris], Yatskar, M.[Mark], Liu, L.J.[Ling-Jie],
CoMo: Controllable Motion Generation Through Language Guided Pose Code Editing,
ECCV24(XXIX: 180-196).
Springer DOI 2412
BibRef

Sampieri, A.[Alessio], Palma, A.[Alessio], Spinelli, I.[Indro], Galasso, F.[Fabio],
Length-aware Motion Synthesis via Latent Diffusion,
ECCV24(LIII: 107-124).
Springer DOI 2412
BibRef

Zhu, L.[Lin], Zheng, Y.L.[Yun-Long], Zhang, Y.J.[Yi-Jun], Wang, X.[Xiao], Wang, L.Z.[Li-Zhi], Huang, H.[Hua],
Temporal Residual Guided Diffusion Framework for Event-driven Video Reconstruction,
ECCV24(XL: 411-427).
Springer DOI 2412
BibRef

Zou, Q.[Qiran], Yuan, S.[Shangyuan], Du, S.[Shian], Wang, Y.[Yu], Liu, C.[Chang], Xu, Y.[Yi], Chen, J.[Jie], Ji, X.Y.[Xiang-Yang],
ParCo: Part-coordinating Text-to-motion Synthesis,
ECCV24(LVI: 126-143).
Springer DOI 2412
BibRef

Jin, P.[Peng], Li, H.[Hao], Cheng, Z.[Zesen], Li, K.[Kehan], Yu, R.[Runyi], Liu, C.[Chang], Ji, X.Y.[Xiang-Yang], Yuan, L.[Li], Chen, J.[Jie],
Local Action-guided Motion Diffusion Model for Text-to-motion Generation,
ECCV24(XXV: 392-409).
Springer DOI 2412
BibRef

Chi, S.G.[Seung-Geun], Chi, H.G.[Hyung-Gun], Ma, H.[Hengbo], Agarwal, N.[Nakul], Siddiqui, F.[Faizan], Ramani, K.[Karthik], Lee, K.[Kwonjoon],
M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models,
ECCV24(XIV: 18-36).
Springer DOI 2412
BibRef

Bahmani, S.[Sherwin], Liu, X.[Xian], Wang, Y.F.[Yi-Fan], Skorokhodov, I.[Ivan], Rong, V.[Victor], Liu, Z.W.[Zi-Wei], Liu, X.H.[Xi-Hui], Park, J.J.[Jeong Joon], Tulyakov, S.[Sergey], Wetzstein, G.[Gordon], Tagliasacchi, A.[Andrea], Lindell, D.B.[David B.],
TC4D: Trajectory-Conditioned Text-to-4D Generation,
ECCV24(XLVI: 53-72).
Springer DOI 2412
BibRef

Fan, K.[Ke], Tang, J.[Junshu], Cao, W.J.[Wei-Jian], Yi, R.[Ran], Li, M.[Moran], Gong, J.Y.[Jing-Yu], Zhang, J.N.[Jiang-Ning], Wang, Y.[Yabiao], Wang, C.J.[Cheng-Jie], Ma, L.Z.[Li-Zhuang],
FreeMotion: A Unified Framework for Number-Free Text-to-Motion Synthesis,
ECCV24(VIII: 93-109).
Springer DOI 2412
BibRef

Oh, G.[Gyeongrok], Jeong, J.[Jaehwan], Kim, S.[Sieun], Byeon, W.[Wonmin], Kim, J.[Jinkyu], Kim, S.[Sungwoong], Kim, S.[Sangpil],
MEVG: Multi-event Video Generation with Text-to-video Models,
ECCV24(XLIII: 401-418).
Springer DOI 2412
BibRef

Girdhar, R.[Rohit], Singh, M.[Mannat], Brown, A.[Andrew], Duval, Q.[Quentin], Azadi, S.[Samaneh], Rambhatla, S.S.[Sai Saketh], Shah, A.[Akbar], Yin, X.[Xi], Parikh, D.[Devi], Misra, I.[Ishan],
Factorizing Text-to-video Generation by Explicit Image Conditioning,
ECCV24(LXII: 205-224).
Springer DOI 2412
BibRef

Materzynska, J.[Joanna], Sivic, J.[Josef], Shechtman, E.[Eli], Torralba, A.[Antonio], Zhang, R.[Richard], Russell, B.[Bryan],
NewMove: Customizing Text-to-video Models with Novel Motions,
ACCV24(V: 113-130).
Springer DOI 2412
BibRef

Ren, Y.X.[Yi-Xuan], Zhou, Y.[Yang], Yang, J.[Jimei], Shi, J.[Jing], Liu, D.[Difan], Liu, F.[Feng], Kwon, M.[Mingi], Shrivastava, A.[Abhinav],
Customize-A-Video: One-shot Motion Customization of Text-to-video Diffusion Models,
ECCV24(LXXXIX: 332-349).
Springer DOI 2412
BibRef

Zhang, J.T.[Jun-Tao], Liu, Y.[Yuehuai], Tai, Y.W.[Yu-Wing], Tang, C.K.[Chi-Keung],
C3Net: Compound Conditioned ControlNet for Multimodal Content Generation,
CVPR24(26876-26885)
IEEE DOI 2410
Training, Interpolation, Semantics, Training data, Computer architecture, Aerospace electronics, Diffusion models BibRef

Li, Z.Q.[Zheng-Qi], Tucker, R.[Richard], Snavely, N.[Noah], Holynski, A.[Aleksander],
Generative Image Dynamics,
CVPR24(24142-24153)
IEEE DOI 2410
Code:
WWW Link. Solid modeling, Dynamics, Video sequences, Predictive models, Diffusion models, Rendering (computer graphics), Turning BibRef

Zhuang, S.[Shaobin], Li, K.[Kunchang], Chen, X.Y.[Xin-Yuan], Wang, Y.[Yaohui], Liu, Z.W.[Zi-Wei], Qiao, Y.[Yu], Wang, Y.[Yali],
Vlogger: Make Your Dream A Vlog,
CVPR24(8806-8817)
IEEE DOI 2410
Training, Visualization, Spatial coherence, Coherence, Diffusion models, Boosting, Planning, spatial-temporal coherence BibRef

Zeng, Y.[Yan], Wei, G.Q.[Guo-Qiang], Zheng, J.[Jiani], Zou, J.X.[Jia-Xin], Wei, Y.[Yang], Zhang, Y.C.[Yu-Chen], Li, H.[Hang],
Make Pixels Dance: High-Dynamic Video Generation,
CVPR24(8850-8860)
IEEE DOI 2410
Training, Humanities, Focusing, Diffusion models, Visual effects, Video Generation, Diffusion Models BibRef

Zhang, Z.C.[Zhi-Cheng], Hu, J.[Junyao], Cheng, W.T.[Wen-Tao], Paudel, D.[Danda], Yang, J.F.[Ju-Feng],
ExtDM: Distribution Extrapolation Diffusion Model for Video Prediction,
CVPR24(19310-19320)
IEEE DOI 2410
Extrapolation, Solid modeling, Uncertainty, Computational modeling, Predictive models, Diffusion models, Video Generation, Diffusion Model BibRef

Huang, Z.Q.[Zi-Qi], He, Y.[Yinan], Yu, J.[Jiashuo], Zhang, F.[Fan], Si, C.Y.[Chen-Yang], Jiang, Y.M.[Yu-Ming], Zhang, Y.H.[Yuan-Han], Wu, T.X.[Tian-Xing], Jin, Q.Y.[Qing-Yang], Chanpaisit, N.[Nattapol], Wang, Y.[Yaohui], Chen, X.Y.[Xin-Yuan], Wang, L.M.[Li-Min], Lin, D.[Dahua], Qiao, Y.[Yu], Liu, Z.W.[Zi-Wei],
VBench: Comprehensive Benchmark Suite for Video Generative Models,
CVPR24(21807-21818)
IEEE DOI 2410
Measurement, Image synthesis, Annotations, Computational modeling, Benchmark testing, evaluation, human preference BibRef

Skorokhodov, I.[Ivan], Menapace, W.[Willi], Siarohin, A.[Aliaksandr], Tulyakov, S.[Sergey],
Hierarchical Patch Diffusion Models for High-Resolution Video Generation,
CVPR24(7569-7579)
IEEE DOI Code:
WWW Link. 2410
Training, Limiting, Scalability, Computational modeling, Pipelines, Diffusion models, Generators, video generation, diffusion models, efficiency BibRef

Jiang, Y.M.[Yu-Ming], Wu, T.X.[Tian-Xing], Yang, S.[Shuai], Si, C.Y.[Chen-Yang], Lin, D.[Dahua], Qiao, Y.[Yu], Loy, C.C.[Chen Change], Liu, Z.W.[Zi-Wei],
VideoBooth: Diffusion-based Video Generation with Image Prompts,
CVPR24(6689-6700)
IEEE DOI 2410
Visualization, Image coding, Accuracy, Computational modeling, Video Generation, Diffusion Models BibRef

Wei, Y.J.[Yu-Jie], Zhang, S.W.[Shi-Wei], Qing, Z.W.[Zhi-Wu], Yuan, H.J.[Hang-Jie], Liu, Z.H.[Zhi-Heng], Liu, Y.[Yu], Zhang, Y.[Yingya], Zhou, J.[Jingren], Shan, H.M.[Hong-Ming],
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion,
CVPR24(6537-6549)
IEEE DOI Code:
WWW Link. 2410
Adaptation models, Image synthesis, Computational modeling, Diffusion models, Controllability, customized generation BibRef

Cai, S.Q.[Sheng-Qu], Ceylan, D.[Duygu], Gadelha, M.[Matheus], Huang, C.H.P.[Chun-Hao Paul], Wang, T.Y.[Tuanfeng Yang], Wetzstein, G.[Gordon],
Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models,
CVPR24(7611-7620)
IEEE DOI 2410
Geometry, Pipelines, Text to image, Manuals, Diffusion models, Rendering (computer graphics), Computer Graphics, Animation BibRef

Menapace, W.[Willi], Siarohin, A.[Aliaksandr], Skorokhodov, I.[Ivan], Deyneka, E.[Ekaterina], Chen, T.S.[Tsai-Shien], Kag, A.[Anil], Fang, Y.W.[Yu-Wei], Stoliar, A.[Aleksei], Ricci, E.[Elisa], Ren, J.[Jian], Tulyakov, S.[Sergey],
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis,
CVPR24(7038-7048)
IEEE DOI 2410
Training, Visualization, Image coding, Computational modeling, Scalability, Computer architecture, Transformers, video generation, efficiency BibRef

Tian, K.B.[Kai-Bin], Zhao, R.X.[Rui-Xiang], Xin, Z.J.[Zi-Jie], Lan, B.X.[Bang-Xiang], Li, X.R.[Xi-Rong],
Holistic Features are Almost Sufficient for Text-to-Video Retrieval,
CVPR24(17138-17147)
IEEE DOI 2410
Computational modeling, Scalability, Ad hoc networks, Text to video BibRef

Gal, R.[Rinon], Vinker, Y.[Yael], Alaluf, Y.[Yuval], Bermano, A.[Amit], Cohen-Or, D.[Daniel], Shamir, A.[Ariel], Chechik, G.[Gal],
Breathing Life Into Sketches Using Text-to-Video Priors,
CVPR24(4325-4336)
IEEE DOI 2410
Training, Deformation, Animation, Diffusion models, Vectors, Sketch animation, text-to-video, diffusion score distillation BibRef

Jain, Y.[Yash], Nasery, A.[Anshul], Vineet, V.[Vibhav], Behl, H.[Harkirat],
Peekaboo: Interactive Video Generation via Masked-Diffusion,
CVPR24(8079-8088)
IEEE DOI 2410
Training, Codes, Computational modeling, Benchmark testing, Creativity, video generation, diffusion, text to video BibRef

Yatim, D.[Danah], Fridman, R.[Rafail], Bar-Tal, O.[Omer], Kasten, Y.[Yoni], Dekel, T.[Tali],
Space-Time Diffusion Features for Zero-Shot Text-Driven Motion Transfer,
CVPR24(8466-8476)
IEEE DOI Code:
WWW Link. 2410
Shape, Layout, Dogs, Diffusion models, Text to video, video editing, motion transfer, diffusion models BibRef

Qing, Z.W.[Zhi-Wu], Zhang, S.W.[Shi-Wei], Wang, J.[Jiayu], Wang, X.[Xiang], Wei, Y.J.[Yu-Jie], Zhang, Y.[Yingya], Gao, C.X.[Chang-Xin], Sang, N.[Nong],
Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation,
CVPR24(6635-6645)
IEEE DOI 2410
Training, Source coding, Semantics, Spatial coherence, Cognition, Stability analysis, Complexity theory BibRef

Wang, X.[Xiang], Zhang, S.W.[Shi-Wei], Yuan, H.J.[Hang-Jie], Qing, Z.W.[Zhi-Wu], Gong, B.[Biao], Zhang, Y.[Yingya], Shen, Y.J.[Yu-Jun], Gao, C.X.[Chang-Xin], Sang, N.[Nong],
A Recipe for Scaling up Text-to-Video Generation with Text-free Videos,
CVPR24(6572-6582)
IEEE DOI 2410
Training, Video on demand, Scalability, Pipelines, Text to image, Performance gain BibRef

Kim, T.[Taehoon], Kang, C.[ChanHee], Park, J.[JaeHyuk], Jeong, D.[Daun], Yang, C.[ChangHee], Kang, S.J.[Suk-Ju], Kong, K.[Kyeongbo],
Human Motion Aware Text-to-Video Generation with Explicit Camera Control,
WACV24(5069-5078)
IEEE DOI Code:
WWW Link. 2404
Knowledge engineering, Codes, Punching, Cameras, Algorithms, Generative models for image, video, 3D, etc., Algorithms, Biometrics, Vision + language and/or other modalities BibRef

Chen, S.[Shoufa], Xu, M.M.[Meng-Meng], Ren, J.W.[Jia-Wei], Cong, Y.[Yuren], He, S.[Sen], Xie, Y.P.[Yan-Ping], Sinha, A.[Animesh], Luo, P.[Ping], Xiang, T.[Tao], Perez-Rua, J.M.[Juan-Manuel],
GenTron: Diffusion Transformers for Image and Video Generation,
CVPR24(6441-6451)
IEEE DOI Code:
WWW Link. 2410
Visualization, Adaptation models, Scalability, Transformers, Diffusion models, Quality assessment, Diffusion Transformers, Text-to-Video Generation BibRef

Lee, T.[Taegyeong], Kwon, S.[Soyeong], Kim, T.[Taehwan],
Grid Diffusion Models for Text-to-Video Generation,
CVPR24(8734-8743)
IEEE DOI 2410
Visualization, Computational modeling, Memory management, Text to image, Graphics processing units, Diffusion models BibRef

Eldesokey, A.[Abdelrahman], Wonka, P.[Peter],
LatentMan: Generating Consistent Animated Characters using Image Diffusion Models,
GCV24(7510-7519)
IEEE DOI 2410
Bridges, Visualization, Computational modeling, Text to image, Diffusion processes, Diffusion Models, Text-to-Video, Animation, Text-to-Image BibRef

Yuan, X.[Xin], Baek, J.[Jinoo], Xu, K.[Keyang], Tov, O.[Omer], Fei, H.L.[Hong-Liang],
Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution,
VAQuality24(489-496)
IEEE DOI 2404
Adaptation models, Visualization, Computational modeling, Superresolution, Computer architecture BibRef

Wu, J.Z.J.[Jay Zhang-Jie], Ge, Y.X.[Yi-Xiao], Wang, X.[Xintao], Lei, S.W.X.[Stan Wei-Xian], Gu, Y.C.[Yu-Chao], Shi, Y.F.[Yu-Fei], Hsu, W.[Wynne], Shan, Y.[Ying], Qie, X.[Xiaohu], Shou, M.Z.[Mike Zheng],
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation,
ICCV23(7589-7599)
IEEE DOI 2401
BibRef

Weng, W.M.[Wen-Ming], Feng, R.[Ruoyu], Wang, Y.H.[Yan-Hui], Dai, Q.[Qi], Wang, C.Y.[Chun-Yu], Yin, D.C.[Da-Cheng], Zhao, Z.Y.[Zhi-Yuan], Qiu, K.[Kai], Bao, J.M.[Jian-Min], Yuan, Y.H.[Yu-Hui], Luo, C.[Chong], Zhang, Y.Y.[Yue-Yi], Xiong, Z.W.[Zhi-Wei],
ART•V: Auto-Regressive Text-to-Video Generation with Diffusion Models,
GCV24(7395-7405)
IEEE DOI 2410
Noise, Training data, Coherence, Predictive models, Diffusion models BibRef

Ji, P.L.[Peng-Liang], Xiao, C.[Chuyang], Tai, H.L.[Hui-Lin], Huo, M.X.[Ming-Xiao],
T2VBench: Benchmarking Temporal Dynamics for Text-to-Video Generation,
GenerativeFM24(5325-5335)
IEEE DOI 2410
Measurement, Analytical models, Computational modeling, Encyclopedias, Benchmark testing, multimodal BibRef

Godfrey, W.W.[W. Wilfred], Ratna, A.[Abhinav],
Enhancing the Video Editing Capabilities of Text-to-Video Generators Using DDPM Inversion,
ICCVMI23(1-5)
IEEE DOI 2403
Visualization, Computational modeling, Refining, Pipelines, Noise reduction, Transformers, Probabilistic logic, DDPM Inversion BibRef

Chapter on 3-D Object Description and Computation Techniques, Surfaces, Deformable, View Generation, Video Conferencing continues in
Video Diffusion, Video Synthesis.


Last update: Mar 12, 2025 at 14:27:03