haoningwu3639 / SimpleSDM-Video
A simple and flexible PyTorch implementation of Video StableDiffusion (ZeroScope_v2) based on diffusers.
☆17Updated last year
Alternatives and similar repositories for SimpleSDM-Video:
Users who are interested in SimpleSDM-Video are comparing it to the libraries listed below.
- A simple and flexible PyTorch implementation of StableDiffusion based on diffusers.☆22Updated 6 months ago
- A simple and flexible PyTorch implementation of StableDiffusion-3 based on diffusers for DIY and finetuning.☆18Updated 2 months ago
- Official code for CVPR 2024 paper, "Audio-Visual Segmentation via Unlabeled Frame Exploitation"☆12Updated 8 months ago
- [AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos☆22Updated 6 months ago
- Official code for CVPR 2024 paper: Discriminative Probing and Tuning for Text-to-Image Generation☆30Updated 3 months ago
- [NeurIPS 2024] Efficient Large Multi-modal Models via Visual Context Compression☆53Updated last month
- ☆40Updated 6 months ago
- ☆16Updated 3 months ago
- A Simple Plugin for Transforming Images to Arbitrary Scales☆18Updated 2 years ago
- Official PyTorch code of GroundVQA (CVPR'24)☆58Updated 6 months ago
- [CVPR2025] Code Release of F-LMM: Grounding Frozen Large Multimodal Models☆80Updated 7 months ago
- ☆16Updated last year
- [NeurIPS 2024] TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration☆23Updated 5 months ago
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training"☆33Updated last year
- Official Implementation of VideoDPO☆68Updated 2 months ago
- FQGAN: Factorized Visual Tokenization and Generation☆45Updated 2 months ago
- Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling☆26Updated last month
- Codebase for the paper "Elucidating the design space of language models for image generation"☆45Updated 4 months ago
- [NeurIPS 2024 D&B Track] Official Repo for "LVD-2M: A Long-take Video Dataset with Temporally Dense Captions"☆53Updated 5 months ago
- ☆54Updated last year
- [CVPR'25] VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection☆63Updated this week
- [ICML 2024] The official implementation of SemiRES in PyTorch.☆24Updated 9 months ago
- This is the official implementation for ControlVAR.☆101Updated 3 months ago
- ☆50Updated 9 months ago
- MRGen: Segmentation Data Engine for Underrepresented MRI Modalities☆17Updated 2 weeks ago
- LLMBind: A Unified Modality-Task Integration Framework☆18Updated 9 months ago
- [NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos☆110Updated 3 months ago
- [CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient☆92Updated 3 weeks ago
- Official GitHub repository for the Text-Guided Video Editing (TGVE) competition of LOVEU Workshop @ CVPR'23.☆75Updated last year
- Code for "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement"☆44Updated 3 months ago