arXiv Analytics

arXiv:2403.12034 [cs.CV]

VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models

Junlin Han, Filippos Kokkinos, Philip Torr

Published 2024-03-18, updated 2024-07-18 (version 2)

This paper presents a novel method for building scalable 3D generative models from pre-trained video diffusion models. The primary obstacle in developing foundation 3D generative models is the limited availability of 3D data: unlike images, text, or videos, 3D data are not readily accessible and are difficult to acquire, resulting in a significant disparity in scale compared to the vast quantities of other data types. To address this issue, we propose using a video diffusion model, trained on extensive volumes of text, images, and videos, as a knowledge source for 3D data. By unlocking its multi-view generative capabilities through fine-tuning, we generate a large-scale synthetic multi-view dataset and use it to train a feed-forward 3D generative model. The proposed model, VFusion3D, trained on nearly 3M synthetic multi-view samples, can generate a 3D asset from a single image in seconds and achieves superior performance compared with current state-of-the-art feed-forward 3D generative models, with users preferring our results over 90% of the time.
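To make the two-stage recipe described in the abstract concrete, the following is a minimal, heavily simplified sketch in PyTorch. It replaces the fine-tuned video diffusion model with a stand-in multi-view generator and VFusion3D's reconstructor with a toy feed-forward network; every module name, shape, and hyperparameter below is a hypothetical placeholder chosen for illustration, not the authors' actual architecture or training setup.

    # Sketch of the two-stage pipeline: (1) a frozen multi-view generator
    # stands in for the fine-tuned video diffusion model and supplies
    # synthetic multi-view supervision; (2) a feed-forward model is trained
    # to map a single image to multi-view renders of the same object.
    # All components are illustrative placeholders.

    import torch
    import torch.nn as nn

    NUM_VIEWS = 4      # hypothetical number of synthetic views per object
    IMG_RES = 64       # hypothetical render resolution, kept tiny for the sketch

    class MultiViewGenerator(nn.Module):
        """Placeholder for the fine-tuned video diffusion model: given one
        image, it emits a short multi-view 'video' of the same object."""
        def __init__(self):
            super().__init__()
            self.net = nn.Conv2d(3, 3 * NUM_VIEWS, kernel_size=3, padding=1)

        def forward(self, image):                      # (B, 3, H, W)
            views = self.net(image)                    # (B, 3*V, H, W)
            b, _, h, w = views.shape
            return views.view(b, NUM_VIEWS, 3, h, w)   # (B, V, 3, H, W)

    class FeedForward3DModel(nn.Module):
        """Placeholder for the feed-forward 3D generative model: encodes a
        single image into a latent and decodes multi-view renders from it."""
        def __init__(self, latent_dim=256):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(16, latent_dim),
            )
            self.render_head = nn.Linear(latent_dim, NUM_VIEWS * 3 * IMG_RES * IMG_RES)

        def forward(self, image):
            latent = self.encoder(image)
            renders = self.render_head(latent)
            return renders.view(-1, NUM_VIEWS, 3, IMG_RES, IMG_RES)

    def train_step(generator, model, optimizer, image):
        """One step of stage 2: supervise the feed-forward model with
        synthetic multi-view data from the frozen generator."""
        with torch.no_grad():
            target_views = generator(image)            # synthetic supervision
        pred_views = model(image)
        loss = nn.functional.mse_loss(pred_views, target_views)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

    if __name__ == "__main__":
        gen, model = MultiViewGenerator(), FeedForward3DModel()
        opt = torch.optim.Adam(model.parameters(), lr=1e-4)
        dummy_image = torch.rand(2, 3, IMG_RES, IMG_RES)
        print("loss:", train_step(gen, model, opt, dummy_image))

In the paper's actual setting, the generator is a large video diffusion model fine-tuned for multi-view generation and the second stage is trained on roughly 3M synthetic multi-view samples; the sketch only illustrates the data flow between the two stages.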

Comments: ECCV 2024. Project page: https://junlinhan.github.io/projects/vfusion3d.html
Categories: cs.CV, cs.GR, cs.LG
Related articles:
arXiv:2305.10474 [cs.CV] (Published 2023-05-17)
Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models
Songwei Ge et al.
arXiv:2409.07452 [cs.CV] (Published 2024-09-11)
Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models
arXiv:2312.02813 [cs.CV] (Published 2023-12-05)
BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models