arXiv Analytics

arXiv:2404.01367 [cs.CV]

Bigger is not Always Better: Scaling Properties of Latent Diffusion Models

Kangfu Mei, Zhengzhong Tu, Mauricio Delbracio, Hossein Talebi, Vishal M. Patel, Peyman Milanfar

Published 2024-04-01Version 1

We study the scaling properties of latent diffusion models (LDMs), with an emphasis on their sampling efficiency. While improved network architectures and inference algorithms have been shown to effectively boost the sampling efficiency of diffusion models, the role of model size -- a critical determinant of sampling efficiency -- has not been thoroughly examined. Through empirical analysis of established text-to-image diffusion models, we conduct an in-depth investigation into how model size influences sampling efficiency across varying sampling steps. Our findings unveil a surprising trend: when operating under a given inference budget, smaller models frequently outperform their larger equivalents in generating high-quality results. Moreover, we demonstrate the generalizability of these findings by applying various diffusion samplers, exploring diverse downstream tasks, evaluating post-distilled models, and comparing performance relative to training compute. These findings open up new pathways for developing LDM scaling strategies that enhance generative capabilities within limited inference budgets.
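To make the "given inference budget" comparison concrete, the sketch below (not from the paper) illustrates one way to equalize sampling compute across models of different sizes: if per-step cost scales roughly with parameter count, a smaller model can afford proportionally more denoising steps within the same budget. The model names, parameter counts, and cost model here are illustrative placeholders, not the paper's actual configurations.

```python
# Minimal sketch, assuming per-step sampling cost is roughly proportional to
# parameter count. All names and numbers below are hypothetical.
from dataclasses import dataclass


@dataclass
class LDM:
    name: str
    params_m: float  # parameters in millions (proxy for per-step cost)

    def cost_per_step(self) -> float:
        # Assumption: one denoising step costs ~params_m compute units.
        return self.params_m


def steps_for_budget(model: LDM, budget: float) -> int:
    # How many denoising steps this model can run within the fixed budget.
    return max(1, int(budget // model.cost_per_step()))


models = [LDM("small-ldm", 40), LDM("base-ldm", 250), LDM("large-ldm", 900)]
budget = 20_000.0  # arbitrary normalized compute units shared by all models

for m in models:
    steps = steps_for_budget(m, budget)
    print(f"{m.name}: {m.params_m}M params -> {steps} steps at equal budget")
    # The paper's reported trend: evaluated at equal sampling cost, the
    # smaller model's extra steps often yield comparable or better quality.
```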

Related articles: Most relevant | Search more
arXiv:2211.01324 [cs.CV] (Published 2022-11-02)
eDiffi: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
arXiv:2304.08818 [cs.CV] (Published 2023-04-18)
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
arXiv:2308.12453 [cs.CV] (Published 2023-08-23)
Augmenting medical image classifiers with synthetic data from latent diffusion models