arXiv:2506.17705 [cs.CV]

DreamJourney: Perpetual View Generation with Video Diffusion Models

Bo Pan, Yang Chen, Yingwei Pan, Ting Yao, Wei Chen, Tao Mei

Published 2025-06-21, Version 1

Perpetual view generation aims to synthesize a long-term video corresponding to an arbitrary camera trajectory solely from a single input image. Recent methods commonly utilize a pre-trained text-to-image diffusion model to synthesize new content for previously unseen regions along the camera movement. However, the underlying 2D diffusion model lacks 3D awareness, which results in distorted artifacts. Moreover, these methods are limited to generating views of static 3D scenes, neglecting object movements within the dynamic 4D world. To alleviate these issues, we present DreamJourney, a two-stage framework that leverages the world simulation capacity of video diffusion models to trigger a new perpetual scene view generation task with both camera movements and object dynamics. Specifically, in stage I, DreamJourney first lifts the input image to a 3D point cloud and renders a sequence of partial images from a specific camera trajectory. A video diffusion model is then utilized as a generative prior to complete the missing regions and enhance visual coherence across the sequence, producing a cross-view consistent video that adheres to the 3D scene and camera trajectory. Meanwhile, we introduce two simple yet effective strategies (early stopping and view padding) to further stabilize the generation process and improve visual quality. Next, in stage II, DreamJourney leverages a multimodal large language model to produce a text prompt describing object movements in the current view, and uses a video diffusion model to animate the current view with object movements. Stages I and II are repeated recurrently, enabling perpetual dynamic scene view generation. Extensive experiments demonstrate the superiority of DreamJourney over state-of-the-art methods both quantitatively and qualitatively. Our project page: https://dream-journey.vercel.app.
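The first step of stage I (lifting the image to a point cloud and rendering partial images along a camera trajectory) can be illustrated with a small sketch. This is not the authors' implementation: it assumes a per-pixel depth map is already available (the abstract does not say how depth is obtained), uses a simple pinhole camera, and the names `unproject` and `render_partial` are hypothetical. The returned hole mask marks the missing regions that, per the abstract, a video diffusion model would then complete.

```python
import numpy as np

# Illustrative sketch only; function names and the depth-map input are assumptions.

def unproject(depth, K):
    """Lift a depth map (H, W) to camera-space 3D points (H*W, 3) using pinhole intrinsics K."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u.ravel(), v.ravel(), np.ones(h * w)], axis=0)  # homogeneous pixels, (3, N)
    rays = np.linalg.inv(K) @ pix                                   # back-projected rays with z = 1
    return (rays * depth.ravel()).T                                 # scale each ray by its depth

def render_partial(points, colors, K, R, t, h, w):
    """Project colored points into a camera with world-to-camera pose (R, t).

    Returns the partial image (H, W, 3) and a boolean hole mask (H, W)
    that is True wherever no point landed.
    """
    cam = points @ R.T + t
    z = cam[:, 2]
    front = z > 1e-6                                  # keep points in front of the camera
    cam, z, col = cam[front], z[front], colors[front]
    proj = cam @ K.T
    u = np.round(proj[:, 0] / z).astype(int)
    v = np.round(proj[:, 1] / z).astype(int)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)  # keep points inside the image bounds
    u, v, z, col = u[inside], v[inside], z[inside], col[inside]
    order = np.argsort(-z)                            # far to near, so the nearest point is written last
    image = np.zeros((h, w, 3), dtype=colors.dtype)
    filled = np.zeros((h, w), dtype=bool)
    image[v[order], u[order]] = col[order]
    filled[v, u] = True
    return image, ~filled
```

Calling `render_partial` once per pose along a trajectory yields the sequence of partial images; each hole mask grows as the camera moves away from the input view, which is where a generative prior is needed.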

Categories: cs.CV

Keywords: video diffusion model, perpetual view generation, dreamjourney, model lacks 3d awareness, perpetual dynamic scene view

Related articles:

arXiv:2403.12034 [cs.CV] (Published 2024-03-18, updated 2024-07-18)

VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models

Junlin Han, Filippos Kokkinos, Philip Torr

arXiv:2012.09855 [cs.CV] (Published 2020-12-17)

Infinite Nature: Perpetual View Generation of Natural Scenes from a Single Image

Andrew Liu, Richard Tucker, Varun Jampani, Ameesh Makadia, Noah Snavely, Angjoo Kanazawa

arXiv:2302.01329 [cs.CV] (Published 2023-02-02)

Dreamix: Video Diffusion Models are General Video Editors

Eyal Molad et al.
