🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).
★ 541 · Apr 4, 2025 · Updated 11 months ago
Alternatives and similar repositories for Awesome-LLMs-meet-Multimodal-Generation
Users that are interested in Awesome-LLMs-meet-Multimodal-Generation are comparing it to the libraries listed below.
- Let's finetune video generation models! (★ 547 · Sep 15, 2025 · Updated 6 months ago)
- A curated list of recent diffusion models for video generation, editing, and various other applications. (★ 5,538 · Mar 14, 2026 · Updated 2 weeks ago)
- Code for FreeTraj, a tuning-free method for trajectory-controllable video generation (★ 111 · Sep 19, 2025 · Updated 6 months ago)
- This is a repository for organizing papers, codes and other resources related to unified multimodal models. (★ 807 · Oct 10, 2025 · Updated 5 months ago)
- [ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation. (★ 1,903 · Jan 8, 2026 · Updated 2 months ago)
- Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing" (★ 315 · Sep 28, 2025 · Updated 6 months ago)
- Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation (★ 1,940 · Aug 15, 2024 · Updated last year)
- [CVPR 2025] Official code of "DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Long… (★ 322 · Mar 30, 2025 · Updated last year)
- Official Implementation of VideoDPO (★ 163 · Jun 1, 2025 · Updated 9 months ago)
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization (★ 636 · Oct 29, 2025 · Updated 5 months ago)
- [ICLR 2024] Code for FreeNoise based on VideoCrafter (★ 428 · Aug 25, 2025 · Updated 7 months ago)
- [NeurIPS 2025] T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT (★ 432 · Sep 18, 2025 · Updated 6 months ago)
- [ArXiv 2025] A survey about controllable video generation: This repo is the official awesome of "Controllable video generation: A survey… (★ 706 · Nov 11, 2025 · Updated 4 months ago)
- [CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners (★ 155 · Jul 6, 2024 · Updated last year)
- [ICLR 2024 Spotlight] Official implementation of ScaleCrafter for higher-resolution visual generation at inference time. (★ 509 · Mar 7, 2024 · Updated 2 years ago)
- A reading list of video generation (★ 687 · Mar 23, 2026 · Updated last week)
- 🔥🔥🔥 A curated list of papers on recent diffusion-based high-resolution image and video synthesis works. (★ 166 · Dec 26, 2024 · Updated last year)
- [CSUR] A Survey on Video Diffusion Models (★ 2,281 · Mar 14, 2026 · Updated 2 weeks ago)
- LVDM: Latent Video Diffusion Models for High-Fidelity Long Video Generation (★ 504 · Nov 16, 2024 · Updated last year)
- A collection of awesome video generation studies. (★ 753 · Dec 27, 2025 · Updated 3 months ago)
- a collection of awesome autoregressive visual generation models