arXiv:2401.06578 [cs.CV]

360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model

Qian Wang, Weiqi Li, Chong Mou, Xinhua Cheng, Jian Zhang

Published 2024-01-12, Version 1

360-degree panoramic videos have recently attracted growing interest in both research and applications, owing to the immersive experiences they offer. Because capturing 360-degree panoramic videos is expensive, generating panoramic videos from given prompts is highly desirable. Emerging text-to-video (T2V) diffusion methods have proven notably effective for standard video generation. However, due to the significant gap in content and motion patterns between panoramic and standard videos, these methods struggle to produce satisfactory 360-degree panoramic videos. In this paper, we propose a controllable panorama video generation pipeline named 360-Degree Video Diffusion model (360DVD) for generating panoramic videos based on given prompts and motion conditions. Concretely, we introduce a lightweight module dubbed 360-Adapter, together with auxiliary 360 Enhancement Techniques, to transform pre-trained T2V models for 360-degree video generation. We further propose a new panorama dataset named WEB360, consisting of 360-degree video-text pairs for training 360DVD, addressing the absence of captioned panoramic video datasets. Extensive experiments demonstrate the superiority and effectiveness of 360DVD for panorama video generation. The code and dataset will be released soon.

Comments: arXiv admin note: text overlap with arXiv:2307.04725 by other authors

Categories: cs.CV

Keywords: video diffusion model, panoramic video, methods demonstrate notable effectiveness, dataset named web360 consisting, controllable panorama video generation pipeline

Related articles:

arXiv:2312.02813 [cs.CV] (Published 2023-12-05)

BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models

Fengyuan Shi, Jiaxi Gu, Hang Xu, Songcen Xu, Wei Zhang, Limin Wang

arXiv:1710.10755 [cs.CV] (Published 2017-10-30)

Modeling Attention in Panoramic Video: A Deep Reinforcement Learning Approach

Mai Xu, Yuhang Song, Jianyi Wang, Minglang Qiao, Liangyu Huo, Zulin Wang

arXiv:2403.11535 [cs.CV] (Published 2024-03-18, updated 2024-08-23)

AICL: Action In-Context Learning for Video Diffusion Model