yukw777 / VideoBLIP
Supercharged BLIP-2 that can handle videos
☆116 Updated 11 months ago
Related projects
Alternatives and complementary repositories for VideoBLIP
- EILeV: Eliciting In-Context Learning in Vision-Language Models for Videos Through Curated Data Distributional Properties ☆117 Updated last week
- [ICLR 2024] LLM-grounded Video Diffusion Models (LVD): official implementation for the LVD paper ☆126 Updated 6 months ago
- [NeurIPS 2024 Spotlight] The official implementation of the research paper "MotionBooth: Motion-Aware Customized Text-to-Video Generation" ☆108 Updated last month
- ☆189 Updated 3 months ago
- ☆164 Updated 3 months ago
- T2VScore: Towards A Better Metric for Text-to-Video Generation ☆77 Updated 7 months ago
- [NeurIPS 2024] VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models ☆109 Updated last month
- official repo for "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation" [EMNLP2024] ☆48 Updated this week
- Official code for 'Paragraph-to-Image Generation with Information-Enriched Diffusion Model' ☆94 Updated 5 months ago
- ☆210 Updated 6 months ago
- ☆119 Updated last month
- ☆138 Updated this week
- ☆55 Updated 6 months ago
- [CVPR 2024] EvalCrafter: Benchmarking and Evaluating Large Video Generation Models ☆139 Updated last month
- [ACL 2024 Findings] "TempCompass: Do Video LLMs Really Understand Videos?", Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, … ☆83 Updated 2 weeks ago
- ☆72 Updated 5 months ago
- Code release for "EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone" [ICCV, 2023] ☆90 Updated 4 months ago
- [NeurIPS 2024] Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation ☆48 Updated last week
- Interactive Video Generation via Masked-Diffusion ☆68 Updated 6 months ago
- [ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners" ☆123 Updated last month
- HQ-Edit: A High-Quality and High-Coverage Dataset for General Image Editing ☆74 Updated 6 months ago
- ☆126 Updated last week
- ☆48 Updated last year
- A new multi-shot video understanding benchmark Shot2Story with comprehensive video summaries and detailed shot-level captions. ☆98 Updated last month
- Code for "DreamEdit: Subject-driven Image Editing" (TMLR2023) ☆105 Updated 9 months ago
- ☆145 Updated 3 weeks ago
- The HD-VG-130M Dataset ☆108 Updated 7 months ago
- [CVPR2024] MotionEditor is the first diffusion-based model capable of video motion editing. ☆136 Updated 4 months ago
- [CVPR 2024] On the Content Bias in Fréchet Video Distance ☆88 Updated last month
- TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering ☆137 Updated 6 months ago