☆43 Feb 20, 2026 Updated last month
Alternatives and similar repositories for sparse-causal-diffusion
Users who are interested in sparse-causal-diffusion are comparing it to the libraries listed below.
- Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations ☆22 Dec 24, 2025 Updated 2 months ago
- Scalable Minecraft multiplayer data collection engine ☆114 Updated this week
- [ICML 2025] Closed-Loop Long-Horizon Robotic Planning via Equilibrium Sequence Modeling ☆12 May 5, 2025 Updated 10 months ago
- (CVPR Workshop Best Paper Award) Benchmarking Multi-modal Semantic Segmentation under Sensor Failures: Missing and Noisy Modality Robustn… ☆17 Nov 4, 2025 Updated 4 months ago
- 4RC: 4D Reconstruction via Conditional Querying Anytime and Anywhere ☆93 Feb 11, 2026 Updated last month
- (ICCV 2025) OmniSAM: Omnidirectional Segment Anything Model for UDA in Panoramic Semantic Segmentation ☆15 Oct 11, 2025 Updated 5 months ago
- ☆48 Dec 9, 2025 Updated 3 months ago
- DreamStyle: A Unified Framework for Video Stylization ☆113 Jan 7, 2026 Updated 2 months ago
- ICLR 2025 paper X-NeMo & Project X-Portrait2 ☆120 Aug 7, 2025 Updated 7 months ago
- ☆82 Oct 13, 2025 Updated 5 months ago
- Simple MoE - Day 17 of 365 Days of Repos ☆17 Jan 17, 2025 Updated last year
- Official code for ICLR 2024 paper, SEABO: A Simple Search-Based Method for Offline Imitation Learning ☆12 Jan 19, 2024 Updated 2 years ago
- RLHF for Video Diffusion Models ☆26 Jul 30, 2025 Updated 7 months ago
- Overworld's local world client interface to run Waypoint world models ☆46 Mar 13, 2026 Updated last week
- Official Codebase for "Aligning Diffusion Behaviors with Q-functions for Efficient Continuous Control" (NeurIPS 2024) ☆15 Oct 29, 2024 Updated last year
- Codebase for the paper "Elucidating the design space of language models for image generation" ☆46 Nov 17, 2024 Updated last year
- MemorySAM: Memorize Modalities and Semantics with Segment Anything Model 2 for Multi-modal Semantic Segmentation ☆40 Nov 4, 2025 Updated 4 months ago
- Official implementation of "MV-TAP: Tracking Any Point in Multi-View Videos" ☆39 Mar 10, 2026 Updated last week
- ReDiffuser: Reliable Decision-Making Using a Diffuser with Confidence Estimation ☆15 Jun 2, 2024 Updated last year
- The first open-domain closed-loop revisited benchmark for evaluating memory consistency and action control in world models. ☆48 Feb 10, 2026 Updated last month
- This repo implements a video generation model using Latent Diffusion Transformers (Latte) in PyTorch and provides training and inference cod… ☆17 Jan 6, 2025 Updated last year
- Original code base for On Pretraining Data Diversity for Self-Supervised Learning ☆14 Dec 30, 2024 Updated last year
- ☆41 Mar 11, 2026 Updated last week
- Official implementation of paper: Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-… ☆40 Sep 18, 2024 Updated last year
- ☆21 Apr 15, 2024 Updated last year
- [NeurIPS 2025, Spotlight]: Ambient-o: Training Good models with Bad Data. ☆33 Jan 21, 2026 Updated last month
- ☆12 Mar 11, 2025 Updated last year
- Official implementation of "Repurposing Video Diffusion Transformers for Robust Point Tracking" ☆41 Dec 24, 2025 Updated 2 months ago
- AIS 2024 Challenge, Real-Time 4K Super-Resolution of Compressed AVIF Images (Runner-Up Award in Track: Fidelity PSNR), Team XJTU-AIR ☆22 Jul 23, 2024 Updated last year
- Macro-from-Micro Planning for High-Quality and Parallelized Autoregressive Long Video Generation ☆37 Oct 31, 2025 Updated 4 months ago
- Implementation of Prompt-to-Prompt Image Editing with Cross Attention Control ☆16 Apr 5, 2023 Updated 2 years ago
- DELTA: Dense Efficient Long-range 3D Tracking for Any video (ICLR 2025) ☆138 Apr 6, 2025 Updated 11 months ago
- Vico: Compositional Video Generation as Flow Equalization ☆59 Nov 15, 2024 Updated last year
- ☆122 Mar 7, 2026 Updated last week
- Official repository of SoftREPA: Aligning Text to Image in Diffusion Models is Easier Than You Think ☆19 Jun 5, 2025 Updated 9 months ago
- Implementation of "VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space" - ECCV 2024 ☆13 Mar 24, 2025 Updated 11 months ago
- ☆65 Updated this week
- Official implementation of Rolling Sink: Bridging Limited-Horizon Training and Open-Ended Testing in Autoregressive Video Diffusion ☆70 Updated this week
- Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open … ☆23 Mar 13, 2026 Updated last week