haoningwu3639 / SimpleSDM-Video
A simple and flexible PyTorch implementation of Video StableDiffusion (ZeroScope_v2) based on diffusers.
☆17 · Updated last year
Alternatives and similar repositories for SimpleSDM-Video:
Users who are interested in SimpleSDM-Video are comparing it to the repositories listed below.
- A simple and flexible PyTorch implementation of StableDiffusion based on diffusers. ☆23 · Updated 7 months ago
- A simple and flexible PyTorch implementation of StableDiffusion-3 based on diffusers for DIY and finetuning. ☆18 · Updated 3 months ago
- Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling ☆31 · Updated 2 months ago
- [CVPR2025] Code Release of F-LMM: Grounding Frozen Large Multimodal Models ☆86 · Updated 9 months ago
- ☆22 · Updated last month
- [NeurIPS 2024 D&B Track] Official Repo for "LVD-2M: A Long-take Video Dataset with Temporally Dense Captions" ☆55 · Updated 6 months ago
- [ICLR2025] ClassDiffusion: Official impl. of Paper "ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance" ☆41 · Updated 2 months ago
- Official code for CVPR 2024 paper: Discriminative Probing and Tuning for Text-to-Image Generation ☆32 · Updated last month
- Official GitHub repository for the Text-Guided Video Editing (TGVE) competition of LOVEU Workshop @ CVPR'23. ☆75 · Updated last year
- Training code for CLIP-FlanT5 ☆26 · Updated 9 months ago
- FQGAN: Factorized Visual Tokenization and Generation ☆50 · Updated last month
- ☆31 · Updated last year
- ☆42 · Updated 7 months ago
- ☆17 · Updated 5 months ago
- MRGen: Segmentation Data Engine for Underrepresented MRI Modalities ☆18 · Updated last month
- ☆28 · Updated 4 months ago
- ☆14 · Updated 2 weeks ago
- ImageGen-CoT: Enhancing Text-to-Image In-context Learning with Chain-of-Thought Reasoning ☆31 · Updated last month
- Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis ☆83 · Updated 9 months ago
- Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation ☆82 · Updated 3 weeks ago
- ☆79 · Updated last month
- Code for "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement" ☆47 · Updated 5 months ago
- Codebase for the paper "Elucidating the design space of language models for image generation" ☆45 · Updated 5 months ago
- [CVPR 2025] InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption 🔍 ☆40 · Updated last month
- ICCV2023-Diffusion-Papers ☆108 · Updated last year
- The benchmark for "Video Object Segmentation in Panoptic Wild Scenes". ☆12 · Updated last year
- ☆16 · Updated last year
- [ICCV 2023] Generative Prompt Model for Weakly Supervised Object Localization ☆57 · Updated last year
- ☆23 · Updated last month
- [NeurIPS 2024] Efficient Large Multi-modal Models via Visual Context Compression ☆55 · Updated 2 months ago