arXiv:2409.06481 [cs.CV]
Keywords: dont want, source image, target image, multiple vision-language tasks, first large-scale dataset