arXiv:2104.10157 [cs.CV]
Keywords: video generation, transformer, generate high fidelity natural images, learns downsampled discrete latent representations, architecture
Tags: github project