kjerk / instructblip-pipeline
A multimodal inference pipeline that integrates InstructBLIP with textgen-webui for Vicuna and related models.
☆30 Updated last year
Related projects
Alternatives and complementary repositories for instructblip-pipeline
- Implementation of "SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing" ☆84 Updated 10 months ago
- Implementation of the premier Text to Video model from OpenAI ☆57 Updated last week
- sd3 dreambooth lora training book, adapted from the diffusers doc ☆42 Updated 5 months ago
- finetune your florence2 model easy ☆20 Updated 3 months ago
- An open source community implementation of the model from the paper: "Movie Gen: A Cast of Media Foundation Models". Join our community … ☆55 Updated this week
- Unofficial implementation. Stable diffusion model trained by AI Feedback-Based Self-Training Direct Preference Optimization. ☆59 Updated 8 months ago
- ☆25 Updated 8 months ago
- Keyframe Interpolation with CogvideoX ☆84 Updated 3 weeks ago
- faster parallel inference of mochi-1 video generation model ☆73 Updated last week
- Modern Stable Diffusion models family - Fluently ☆26 Updated 5 months ago
- A notebook-based web UI for DeepFloyd IF ☆24 Updated 5 months ago
- ☆30 Updated 11 months ago
- Implementation of SmoothCache, a project aimed at speeding-up Diffusion Transformer (DiT) based GenAI models with error-guided caching. ☆18 Updated this week
- Public code release for the paper "ProCreate, Don’t Reproduce! Propulsive Energy Diffusion for Creative Generation" ☆34 Updated 2 weeks ago
- TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation ☆50 Updated last month
- [ACM MM24] Official implementation of ACM MM 2024 paper: "ZePo: Zero-Shot Portrait Stylization with Faster Sampling" ☆34 Updated 3 months ago
- ☆24 Updated 5 months ago
- We introduce OpenStory++, a large-scale open-domain dataset focusing on enabling MLLMs to perform storytelling generation tasks. ☆11 Updated 2 months ago
- Generate images from an initial frame and text ☆37 Updated last year
- A simple script that reads a directory of videos, grabs a random frame, and automatically discovers a prompt for it ☆131 Updated 10 months ago
- Official implementation of UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified … ☆64 Updated last week
- This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3 ?" ☆121 Updated 5 months ago
- ☆86 Updated 9 months ago
- [TMLR23] Official implementation of UnIVAL: Unified Model for Image, Video, Audio and Language Tasks. ☆224 Updated 11 months ago
- ☆145 Updated 3 months ago
- A one-stop library to standardize the inference and evaluation of all the conditional video generation models. ☆43 Updated 2 weeks ago
- Official code for 'Paragraph-to-Image Generation with Information-Enriched Diffusion Model' ☆94 Updated 6 months ago