[ICLR 2026] Uniform Discrete Diffusion with Metric Path for Video Generation
☆108 · Feb 6, 2026 · Updated last month
Alternatives and similar repositories for URSA
Users who are interested in URSA are comparing it to the libraries listed below.
- [NeurIPS'24] Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation (Diffews) ☆49 · Apr 14, 2025 · Updated 11 months ago
- Repo of HawkLlama. ☆16 · Jan 2, 2025 · Updated last year
- EvoWorld: Evolving Panoramic World Generation with Explicit 3D Memory ☆61 · Jan 13, 2026 · Updated 2 months ago
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization ☆636 · Oct 29, 2025 · Updated 4 months ago
- ☆13 · Jun 22, 2025 · Updated 9 months ago
- [NeurIPS'24] A Simple Image Segmentation Framework via In-Context Examples ☆65 · Oct 29, 2024 · Updated last year
- ☆39 · Mar 5, 2026 · Updated 2 weeks ago
- JoVA: Unified Multimodal Learning for Joint Video-Audio Generation ☆30 · Dec 22, 2025 · Updated 3 months ago
- ☆37 · Oct 21, 2022 · Updated 3 years ago
- [ICML 2024] Floating Anchor Diffusion Model for Multi-motif Scaffolding ☆31 · Aug 23, 2024 · Updated last year
- [ICLR 2024 Spotlight] The official repo for the paper "De novo Protein Design using Geometric Vector Field Networks". ☆30 · Aug 23, 2024 · Updated last year
- ☆18 · Aug 1, 2025 · Updated 7 months ago
- code for "TVG: A Training-free Transition Video Generation Method with Diffusion Models" ☆49 · Aug 19, 2024 · Updated last year
- DiverGen (CVPR 2024) & BSGAL (ICML 2024) ☆53 · Jul 6, 2025 · Updated 8 months ago
- [ICLR'26] Official PyTorch implementation of "Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models". ☆63 · Mar 5, 2026 · Updated 2 weeks ago
- ☆49 · Oct 6, 2024 · Updated last year
- UniVid: The Open-Source Unified Video Model ☆30 · Oct 13, 2025 · Updated 5 months ago
- ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO ☆79 · Nov 17, 2025 · Updated 4 months ago
- ☆26 · Dec 19, 2025 · Updated 3 months ago
- [TCSVT 2024] Temporally Consistent Referring Video Object Segmentation with Hybrid Memory ☆19 · Apr 9, 2025 · Updated 11 months ago
- Math-VR Benchmark & CodePlot-CoT: Mathematical Visual Reasoning by Thinking with Code-Driven Images ☆55 · Nov 4, 2025 · Updated 4 months ago
- ☆22 · Jul 5, 2025 · Updated 8 months ago
- DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception ☆159 · Dec 6, 2024 · Updated last year
- ☆41 · Oct 29, 2025 · Updated 4 months ago
- Wan: Open and Advanced Large-Scale Video Generative Models ☆28 · Jul 28, 2025 · Updated 7 months ago
- [AAAI 2026] GenMAC for Compositional Text-to-Video Generation ☆32 · Jan 10, 2026 · Updated 2 months ago
- Flux training codes (lora) for UniTEX ☆24 · Jun 8, 2025 · Updated 9 months ago
- [IJCV 2025] Paragraph-to-Image Generation with Information-Enriched Diffusion Model ☆107 · Mar 24, 2025 · Updated last year
- [ICCV25] TACA: Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers ☆41 · Jul 23, 2025 · Updated 8 months ago
- ☆20 · Jan 1, 2026 · Updated 2 months ago
- A Mechanistic View on Video Generation as World Models: State and Dynamics ☆33 · Updated this week
- [NeurIPS 2024] AlphaTablets: A Generic Plane Representation for 3D Planar Reconstruction from Monocular Videos ☆23 · Dec 6, 2024 · Updated last year
- [CVPR 2024] Official PyTorch implementation of FreeCustom: Tuning-Free Customized Image Generation for Multi-Concept Composition ☆176 · Sep 1, 2025 · Updated 6 months ago
- Native Multimodal Models are World Learners ☆1,486 · Dec 30, 2025 · Updated 2 months ago
- [ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation ☆421 · Apr 25, 2025 · Updated 10 months ago
- [ICLR 2026] IVEBench - Benchmark for Instruction-Guided Video Editing ☆71 · Jan 28, 2026 · Updated last month
- The code of the paper "DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects" ☆19 · May 2, 2025 · Updated 10 months ago
- Official implementation of "CAMEO: Correspondence-Attention Alignment for Multi-View Diffusion Models" ☆42 · Feb 24, 2026 · Updated last month
- Official code for "XStreamVGGT: Extremely Memory-Efficient Streaming Vision Geometry Grounded Transformer with KV Cache Compression", … ☆40 · Jan 27, 2026 · Updated last month