Sanster / VLM-demos
A collection of VLM models that can be tried online.
☆14Updated last year
Alternatives and similar repositories for VLM-demos
Users who are interested in VLM-demos are comparing it to the repositories listed below
- Auto Thinking Mode switch for Qwen3 in Open webui☆67Updated 3 months ago
- PresentAgent: Multimodal Agent for Presentation Video Generation☆96Updated last month
- ☆46Updated last year
- Run Open Source Local AI Models in Excel with Ollama☆21Updated 3 weeks ago
- Implementation for the paper "ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems".☆182Updated 5 months ago
- Use Qwen to create prompts for SDXL☆34Updated last year
- Incredibly descriptive audiovisual summaries for videos☆41Updated last year
- XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc.☆38Updated 11 months ago
- Real-time video understanding and interaction through text, audio, image, and video with a large multimodal model. A real-time video understanding and interaction framework built on large multimodal models, via text…☆23Updated last year
- A minimalistic, hackable code base to finetune Wan video generation model☆43Updated 4 months ago
- Enable tool-use ability for any LLM model (DeepSeek V3/R1, etc.)☆53Updated 3 months ago
- 🎧 Pod-Helper: Real-time audio transcription and repair on consumer hardware☆77Updated last year
- ComfyUI wrapper for Moondream's gaze detection☆55Updated 7 months ago
- ☆25Updated last year
- A streamlined implementation of Grounding DINO and SAM for advanced image segmentation. This lightweight solution simplifies the integrat…☆64Updated 11 months ago
- A handy tool for prompt engineers: compare the results of multiple prompts across multiple LLMs at the same time☆96Updated 2 years ago
- ☆39Updated last week
- Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.☆10Updated last year
- A diffusers pipeline for zero shot stylised couples portrait creation☆101Updated 8 months ago
- Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models☆15Updated last year
- Exploration of World Languages☆19Updated last year
- ☆29Updated last year
- ImageSlider custom component for gradio.☆42Updated last year
- HunyuanVideo: A Systematic Framework For Large Video Generation Model☆48Updated 8 months ago
- Learning records for building a large language model from scratch☆57Updated 8 months ago
- Bambo is a new proxy framework. Compared with mainstream frameworks, it is more lightweight and flexible and can handle various load task…☆34Updated 6 months ago
- To try textin document parsing, visit https://cc.co/16YSIy☆22Updated last year
- 🔥Your Daily Dose of AI Research from Hugging Face 🔥 Stay updated with the latest AI breakthroughs! This bot automatically collects and…☆54Updated this week
- ☆24Updated last year
- Gradio app to track objects in video and add visual effects☆17Updated last month