ExponentialML / Video-BLIP2-Preprocessor
A simple script that reads a directory of videos, grabs a random frame from each, and automatically generates a prompt (caption) for it.
☆133
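The pipeline the description outlines (scan a video directory, decode one random frame per file, caption it with BLIP-2) can be sketched as below. This is a hypothetical illustration, not the repository's actual code; the BLIP-2 checkpoint name `Salesforce/blip2-opt-2.7b`, the extension list, and the helper names are assumptions.

```python
import random
from pathlib import Path

# Assumed set of extensions to treat as videos (not from the repo).
VIDEO_EXTS = {".mp4", ".avi", ".mov", ".mkv", ".webm"}

def list_videos(video_dir):
    """Collect video files in a directory (non-recursive), sorted by name."""
    return sorted(p for p in Path(video_dir).iterdir()
                  if p.suffix.lower() in VIDEO_EXTS)

def grab_random_frame(video_path):
    """Decode one randomly chosen frame and return it as an RGB array."""
    import cv2  # lazy import: only needed when actually decoding video
    cap = cv2.VideoCapture(str(video_path))
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    cap.set(cv2.CAP_PROP_POS_FRAMES, random.randrange(max(total, 1)))
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise IOError(f"could not decode a frame from {video_path}")
    return cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # OpenCV decodes as BGR

def caption_frame(frame):
    """Generate a caption ('prompt') for one frame with a BLIP-2 model."""
    from PIL import Image
    from transformers import Blip2ForConditionalGeneration, Blip2Processor
    processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
    model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")
    inputs = processor(images=Image.fromarray(frame), return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(out[0], skip_special_tokens=True).strip()

if __name__ == "__main__":
    for video in list_videos("videos"):  # hypothetical input directory
        frame = grab_random_frame(video)
        print(video.name, "->", caption_frame(frame))
```

In practice the captions would be written to a JSON/CSV sidecar for training rather than printed; the heavy imports are kept inside the functions so the directory-scanning part works without OpenCV or `transformers` installed.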
Alternatives and similar repositories for Video-BLIP2-Preprocessor: users interested in Video-BLIP2-Preprocessor are comparing it to the repositories listed below.
- EILeV: Eliciting In-Context Learning in Vision-Language Models for Videos Through Curated Data Distributional Properties (☆118)
- ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation (TMLR 2024) (☆237)
- Shot2Story: a new multi-shot video understanding benchmark with comprehensive video summaries and detailed shot-level captions (☆117)
- Ground-A-Video: Zero-shot Grounded Video Editing using Text-to-Image Diffusion Models (ICLR 2024) (☆136)
- Official code for "Paragraph-to-Image Generation with Information-Enriched Diffusion Model" (☆102)
- A simple MagicAnimate pipeline including DensePose inference (☆34)
- Retrieval-Augmented Video Generation for Telling a Story (☆253)
- Supercharged BLIP-2 that can handle videos (☆117)
- [CVPR 2024] Intelligent Grimm: Open-ended Visual Storytelling via Latent Diffusion Models (☆237)
- The HD-VG-130M Dataset (☆116)
- [AAAI 2025] Official PyTorch implementation of "VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion …" (☆157)
- [NeurIPS 2024] 💫CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching (☆145)
- [IJCV'24] AutoStory: Generating Diverse Storytelling Images with Minimal Human Effort (☆146)
- [IEEE TVCG 2024] Customized Video Generation Using Textual and Structural Guidance (☆188)
- (CVPR 2024) Official code for the paper "Towards Language-Driven Video Inpainting via Multimodal Large Language Models" (☆89)
- [NeurIPS 2024] VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models (☆132)
- The official implementation of "Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising" (☆295)
- Reuse and Diffuse: Iterative Denoising for Text-to-Video Generation (☆38)
- Implementation of long video generation (☆78)
- [ICLR 2024] Code for FreeNoise based on VideoCrafter (☆398)
- Video Diffusion Alignment via Reward Gradients: improves a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope… (☆232)
- Official implementation of "VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning" (COLM 2024) (☆171)
- AnimateDiff I2V version (☆183)
- Official PyTorch implementation of "VidToMe: Video Token Merging for Zero-Shot Video Editing" (CVPR 2024) (☆214)
- [NeurIPS 2024 Spotlight] Official implementation of "MotionBooth: Motion-Aware Customized Text-to-Video Generation" (☆126)
- [CVPR 2024] VideoBooth: Diffusion-based Video Generation with Image Prompts (☆285)
- [CVPR 2024] EvalCrafter: Benchmarking and Evaluating Large Video Generation Models (☆155)
- I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models (☆206)