fyabc / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆16Updated last week
Related projects:
- A multi-modal AI model that can generate high-quality novel videos from text, images, or video clips.☆64Updated last year
- MuLan: Adapting Multilingual Diffusion Models for 110+ Languages (adds multilingual capability to any diffusion model without additional training)☆124Updated 3 months ago
- ☆79Updated this week
- ☆30Updated 9 months ago
- ControlLLM: Augment Language Models with Tools by Searching on Graphs☆184Updated 2 months ago
- A pipeline parallel training script for LLMs.☆79Updated 3 weeks ago
- ☆78Updated 9 months ago
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs☆53Updated last month
- SD3 DreamBooth LoRA training book, adapted from the diffusers doc☆40Updated 3 months ago
- Official Code for GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation☆128Updated 9 months ago
- A multimodal inference pipeline that integrates InstructBLIP with textgen-webui for Vicuna and related models.☆30Updated last year
- ☆188Updated 8 months ago
- ☆78Updated 8 months ago
- ☆161Updated 2 months ago
- VimTS: A Unified Video and Image Text Spotter☆72Updated 3 months ago
- ☆74Updated 5 months ago
- VCoder: Versatile Vision Encoders for Multimodal Large Language Models, arXiv 2023 / CVPR 2024☆255Updated 5 months ago
- This repository implements the idea of "caption upsampling" from DALL-E 3 with Zephyr-7B and gathers results with SDXL.☆149Updated 10 months ago
- MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation☆180Updated 2 months ago
- A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, qwen-vl, phi3-v …☆123Updated last week
- Parameter-efficient finetuning script for Phi-3-vision, the strong multimodal language model by Microsoft.☆48Updated 3 months ago
- Instruct-tune LLaMA on consumer hardware☆73Updated last year
- Community ComfyUI workflows running on fal.ai☆53Updated 2 weeks ago
- An initiative to replicate Sora☆98Updated 5 months ago
- SkyScript-100M: 1,000,000,000 Pairs of Scripts and Shooting Scripts for Short Drama: https://arxiv.org/abs/2408.09333v2☆92Updated 3 weeks ago
- ☆50Updated 3 months ago
- A family of highly capable yet efficient large multimodal models☆155Updated 3 weeks ago
- Live2Diff: A pipeline that processes live video streams with a uni-directional video diffusion model.☆152Updated last month
- A simple MLLM, built on a 14B LLM, that surpasses QwenVL-Max using only open-source data.☆35Updated last week
- FuseAI Project☆75Updated 3 weeks ago