haoningwu3639 / SimpleSDM-3
A simple and flexible PyTorch implementation of StableDiffusion-3 based on diffusers for DIY and finetuning.
☆21 Updated last month
Alternatives and similar repositories for SimpleSDM-3
Users that are interested in SimpleSDM-3 are comparing it to the libraries listed below
- [CVPR 2025 🔥] Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model ☆158 Updated 2 months ago
- [CVPR2025 Highlight] PAR: Parallelized Autoregressive Visual Generation. https://yuqingwang1029.github.io/PAR-project ☆166 Updated 4 months ago
- PyTorch implementation of DiffMoE, TC-DiT, EC-DiT and Dense DiT ☆120 Updated 3 months ago
- Frequency Autoregressive Image Generation with Continuous Tokens ☆79 Updated last month
- CAR: Controllable AutoRegressive Modeling for Visual Generation ☆121 Updated 7 months ago
- This is the official implementation for ControlVAR. ☆116 Updated 7 months ago
- ☆114 Updated last month
- Implements VAR+CLIP for text-to-image (T2I) generation ☆142 Updated 5 months ago
- Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations ☆117 Updated last week
- [ICCV2025] Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation ☆141 Updated last month
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation ☆129 Updated last month
- Training-Free Condition-Guided Text-to-Video Generation ☆61 Updated 3 months ago
- 🔥 stable, simple, state-of-the-art VQVAE toolkit & cookbook ☆94 Updated last year
- [ICML2025] The code and data of Paper: Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation ☆114 Updated 8 months ago
- [NeurIPS 2024] The official implement of research paper "FreeLong : Training-Free Long Video Generation with SpectralBlend Temporal Atten… ☆53 Updated 2 weeks ago
- [CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient ☆102 Updated 3 months ago
- Autoregressive Image Generation with Randomized Parallel Decoding ☆69 Updated 3 months ago
- [CVPR 2025] T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation ☆87 Updated last month
- Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics ☆128 Updated 2 months ago
- Code for: "Long-Context Autoregressive Video Modeling with Next-Frame Prediction" ☆230 Updated 2 months ago
- Codes accompanying the paper "Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment" ☆33 Updated 5 months ago
- Official implementation of "STAR: Scale-wise Text-to-image generation via Auto-Regressive representations" ☆35 Updated 4 months ago
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025) ☆73 Updated 4 months ago
- [NeurIPS 2024] Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation ☆66 Updated 8 months ago
- Pytorch implementation for the paper titled "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation" ☆385 Updated last month
- [ICLR 2025] ControlAR: Controllable Image Generation with Autoregressive Models ☆280 Updated 2 months ago
- official repo for "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation" [EMNLP2024] ☆92 Updated 5 months ago
- ☆33 Updated 9 months ago
- FQGAN: Factorized Visual Tokenization and Generation ☆51 Updated 3 months ago
- [ICLR 2025] OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation ☆316 Updated last month