A simple and flexible PyTorch implementation of Video StableDiffusion (ZeroScope_v2) based on diffusers.
☆19Feb 15, 2024Updated 2 years ago
Alternatives and similar repositories for SimpleSDM-Video
Users that are interested in SimpleSDM-Video are comparing it to the libraries listed below
- A simple and flexible PyTorch implementation of StableDiffusion based on diffusers.☆24Sep 23, 2024Updated last year
- [BMVC 2023 Oral] Boost Video Frame Interpolation via Motion Adaptation☆18Aug 22, 2024Updated last year
- A simple and flexible PyTorch implementation of StableDiffusion-3 based on diffusers for DIY and finetuning.☆26May 28, 2025Updated 9 months ago
- A simple and flexible PyTorch implementation of StableDiffusion-XL based on diffusers.☆19Sep 2, 2024Updated last year
- [AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos☆33May 27, 2025Updated 9 months ago
- [ICCV 2025] MRGen: Segmentation Data Engine for Underrepresented MRI Modalities☆38Sep 26, 2025Updated 5 months ago
- Official PyTorch code of GroundVQA (CVPR'24)☆64Sep 13, 2024Updated last year
- ☆27Jul 18, 2025Updated 7 months ago
- [BMVC 2023] Zero-shot Composed Text-Image Retrieval☆55Nov 26, 2024Updated last year
- Universal Video Temporal Grounding with Generative Multi-modal Large Language Models☆46Nov 25, 2025Updated 3 months ago
- [ICCV 2025] Object-centric Video Question Answering with Visual Grounding and Referring☆24Aug 8, 2025Updated 6 months ago
- Official Code for paper "FLIP: Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction" (RecSys …☆18Jul 23, 2024Updated last year
- [EMNLP 2024 Oral] MatchTime: Towards Automatic Soccer Game Commentary Generation☆95Jan 2, 2025Updated last year
- [EMNLP 2024] RaTEScore: A Metric for Radiology Report Generation☆63May 18, 2025Updated 9 months ago
- Code implementation of RP3D-Diag☆17Nov 25, 2024Updated last year
- [ICCV 2025 Oral] Official implementation of Learning Streaming Video Representation via Multitask Training.☆84Dec 24, 2025Updated 2 months ago
- The official codes for "AutoRG-Brain: Grounded Report Generation for Brain MRI".☆49Jan 6, 2026Updated last month
- ☆28Dec 19, 2025Updated 2 months ago
- [CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation".☆54May 25, 2025Updated 9 months ago
- [ECCV 2024 Oral] Knowledge-enhanced pretraining for computational pathology☆47Oct 1, 2025Updated 5 months ago
- [CVPR 2026] SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence☆63Jul 9, 2025Updated 7 months ago
- This is the official repository of LLAVIDAL☆23Oct 4, 2025Updated 5 months ago
- Recent Advances on MLLM's Reasoning Ability☆26Apr 11, 2025Updated 10 months ago
- Official repo for the TMLR paper "Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners"☆30Apr 27, 2024Updated last year
- The official codes for "M^3Builder: A Multi-Agent System for Automated Machine Learning in Medical Imaging"☆35Jul 28, 2025Updated 7 months ago
- [ICLR'25] Streaming Video Question-Answering with In-context Video KV-Cache Retrieval☆104Nov 4, 2025Updated 4 months ago
- Code implementation of RP3D-Diag☆78Aug 29, 2025Updated 6 months ago
- Guide to build FFmpeg from source with Netflix's libvmaf on Ubuntu 18.04☆11Oct 12, 2020Updated 5 years ago
- [CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant☆178Jul 7, 2025Updated 7 months ago
- ☆11Oct 25, 2023Updated 2 years ago
- ☆12Jun 5, 2019Updated 6 years ago
- ☆11Dec 6, 2024Updated last year
- [CVPR'23 Highlight] AutoAD: Movie Description in Context.☆103Nov 6, 2024Updated last year
- [NeurIPS2022] Perceptual Attacks of No-Reference Image Quality Models with Human-in-the-Loop☆14Apr 13, 2023Updated 2 years ago
- [AAAI 2026] Official Code for VQAThinker: Exploring Generalizable and Explainable Video Quality Assessment via Reinforcement Learning☆19Nov 28, 2025Updated 3 months ago
- Generative Models for Low Rank Video Representation and Reconstruction☆10May 20, 2019Updated 6 years ago
- [CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"☆12Feb 27, 2024Updated 2 years ago
- ICME'19: Removing Rain in Videos: A Large-scale Database and A Two-stream ConvLSTM Approach☆12Jul 4, 2022Updated 3 years ago
- [TCSVT'24] Official Implementation of 2AFC-LMMs☆12Aug 17, 2024Updated last year