baaivision / URSA
Uniform Discrete Diffusion with Metric Path for Video Generation
☆83 · Updated last week
Alternatives and similar repositories for URSA
Users who are interested in URSA are comparing it to the libraries listed below.
- ☆51 · Updated last year
- Official Repo for Self-Forcing++ High Quality Long Video Generation ☆214 · Updated 2 months ago
- Official PyTorch implementation of DiffMoE, TC-DiT, EC-DiT and Dense DiT ☆156 · Updated 2 months ago
- [NeurIPS 2025] The official repository of "Sekai: A Video Dataset towards World Exploration" ☆217 · Updated 3 weeks ago
- [CVPR 2025] Science-T2I: Addressing Scientific Illusions in Image Synthesis ☆62 · Updated 8 months ago
- Lumos Project: Frontier video unified model research by Alibaba DAMO Academy. ☆148 · Updated 5 months ago
- [ICCV 2025] The official implementation of "Neighboring Autoregressive Modeling for Efficient Visual Generation" ☆58 · Updated 8 months ago
- [NeurIPS 2024] Video Diffusion Models are Training-free Motion Interpreter and Controller ☆49 · Updated 4 months ago
- [CVPR 2025 Highlight] PAR: Parallelized Autoregressive Visual Generation. https://yuqingwang1029.github.io/PAR-project ☆184 · Updated 9 months ago
- [CVPR'25 - Rating 555] Official PyTorch implementation of Lumos: Learning Visual Generative Priors without Text ☆53 · Updated 9 months ago
- Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers" ☆165 · Updated last week
- UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation ☆46 · Updated 4 months ago
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025) ☆86 · Updated 10 months ago
- This is the official repository for the paper "FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehe… ☆112 · Updated 3 months ago
- Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO ☆76 · Updated 3 weeks ago
- [NeurIPS 2025] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations ☆192 · Updated 3 months ago
- [ICCV 2025] Official repo for "GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation" ☆195 · Updated 3 weeks ago
- ☆140 · Updated 2 months ago
- The official repo for "D-AR: Diffusion via Autoregressive Models" ☆129 · Updated 6 months ago
- ☆34 · Updated last year
- Official repository for ReasonGen-R1 ☆73 · Updated 6 months ago
- Official PyTorch implementation - Video Motion Transfer with Diffusion Transformers ☆76 · Updated 5 months ago
- [NeurIPS 2024 D&B Track] Implementation for "FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models" ☆73 · Updated last year
- Official PyTorch Implementation of "SVG-T2I: Scaling up Text-to-Image Latent Diffusion Model Without Variational Autoencoder". ☆115 · Updated last week
- [ICLR 2025] Implementation of Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding ☆47 · Updated 8 months ago
- GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning ☆101 · Updated 7 months ago
- Official implementation of our paper: "Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing" … ☆74 · Updated 7 months ago
- ☆47 · Updated 8 months ago
- CVPRW 2025 paper Progressive Autoregressive Video Diffusion Models: https://arxiv.org/abs/2410.08151 ☆86 · Updated 7 months ago
- [CVPR 2025] Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis ☆129 · Updated 7 months ago