kjerk / instructblip-pipeline
A multimodal inference pipeline that integrates InstructBLIP with textgen-webui for Vicuna and related models.
☆30Updated last year
Related projects:
- Implementation of "SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing"☆84Updated 8 months ago
- Code release for AccDiffusion (ECCV 2024)☆63Updated last month
- TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation☆48Updated 3 months ago
- A simple script that reads a directory of videos, grabs a random frame, and automatically discovers a prompt for it☆130Updated 7 months ago
- This repository implements the idea of "caption upsampling" from DALL-E 3 with Zephyr-7B and gathers results with SDXL.☆149Updated 10 months ago
- Unofficial implementation. Stable diffusion model trained by AI Feedback-Based Self-Training Direct Preference Optimization.☆59Updated 6 months ago
- Fine-tune your Florence-2 model easily☆20Updated last month
- Code repository for T2V-Turbo☆166Updated 2 months ago
- official implementation of VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning (COLM 2024)☆162Updated last month
- A text-to-video model that uses past frames for conditioning, enabling the generation of infinite-length videos.☆19Updated 2 weeks ago
- Fine-tuning code for CLIP models☆120Updated this week
- [inactive] MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation☆13Updated 4 months ago
- Live2Diff: A Pipeline that processes Live video streams by a uni-directional video Diffusion model.☆152Updated last month
- MuLan: Adapting Multilingual Diffusion Models for 110+ Languages (adds multilingual support to any diffusion model without additional training)☆124Updated 3 months ago
- Official Implementation of weights2weights☆98Updated last week
- Modern Stable Diffusion models family - Fluently☆25Updated 3 months ago
- Scripts for use with LongCLIP, including fine-tuning Long-CLIP☆48Updated last month
- SD3 DreamBooth LoRA training book, adapted from the diffusers docs☆40Updated 3 months ago
- Training-and-prompt Free General Painterly Image Harmonization Using image-wise attention sharing☆50Updated 3 months ago
- Official code of the paper: Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis.☆21Updated last week
- A one-stop library to standardize the inference and evaluation of all the conditional image generation models. (ICLR 2024)☆144Updated 2 weeks ago
- SSD-1B, an open-source text-to-image model, outperforming previous versions by being 50% smaller and 60% faster than SDXL.☆164Updated 5 months ago
- Official implementation of UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified …☆58Updated 5 months ago
- InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions☆125Updated 7 months ago
- Official implementation for "pOps: Photo-Inspired Diffusion Operators"☆70Updated last month
- [Arxiv 2024] Official pytorch implementation of "VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion…☆139Updated 5 months ago
- Fine-Grained Subject-Specific Attribute Expression Control in T2I Models☆102Updated 3 months ago