baaivision / URSA
Uniform Discrete Diffusion with Metric Path for Video Generation
☆49 Updated this week
Alternatives and similar repositories for URSA
Users who are interested in URSA are comparing it to the repositories listed below.
- ☆130 Updated 2 weeks ago
- Official repository for ReasonGen-R1 ☆71 Updated 4 months ago
- This is the official repository for the paper "FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehe… ☆102 Updated last month
- GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning ☆101 Updated 5 months ago
- The official implementation of "Neighboring Autoregressive Modeling for Efficient Visual Generation" ☆57 Updated 6 months ago
- [CVPR2025 Highlight] PAR: Parallelized Autoregressive Visual Generation. https://yuqingwang1029.github.io/PAR-project ☆178 Updated 7 months ago
- [ICCV2025] Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation ☆177 Updated 5 months ago
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation ☆91 Updated 7 months ago
- Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis ☆86 Updated last year
- Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers" ☆143 Updated last week
- This is an early exploration to introduce Interleaving Reasoning to Text-to-image Generation field and achieve the SoTA benchmark perform… ☆68 Updated last month
- ICML2025 ☆59 Updated 2 months ago
- ☆161 Updated 4 months ago
- T2I-ReasonBench: Benchmarking Reasoning-Informed Text-to-Image Generation ☆30 Updated last month
- [NeurIPS 2024 D&B Track] Official Repo for "LVD-2M: A Long-take Video Dataset with Temporally Dense Captions" ☆70 Updated last year
- FQGAN: Factorized Visual Tokenization and Generation ☆54 Updated 7 months ago
- [NeurIPS 2025] HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation ☆71 Updated last month
- Codes accompanying the paper "Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment" ☆36 Updated 8 months ago
- [NeurIPS 2025] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations ☆178 Updated last month
- The official repo for "D-AR: Diffusion via Autoregressive Models" ☆121 Updated 4 months ago
- ☆119 Updated 2 months ago
- [NeurIPS 2024] Video Diffusion Models are Training-free Motion Interpreter and Controller ☆48 Updated 2 months ago
- [NeurIPS 2024] The official implementation of the research paper "FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Atten… ☆58 Updated 3 months ago
- [ICCV 2025] Official repo for "GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation" ☆189 Updated 4 months ago
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025) ☆81 Updated 8 months ago
- PyTorch implementation of DiffMoE, TC-DiT, EC-DiT and Dense DiT ☆145 Updated last week
- Official implementation of LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment. ☆84 Updated 5 months ago
- Official repository for the UAE paper, unified-GRPO, and unified-Bench ☆142 Updated last month
- [NeurIPS 2025] ViewPoint: Panoramic Video Generation with Pretrained Diffusion Models ☆21 Updated 3 months ago
- Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling ☆39 Updated 8 months ago