Sanster / VLM-demos
A collection of VLM models that can be tried online.
☆14Updated last year
Alternatives and similar repositories for VLM-demos
Users who are interested in VLM-demos are comparing it to the repositories listed below
- ☆47Updated last year
- Auto Thinking Mode switch for Qwen3 in Open WebUI☆70Updated 8 months ago
- Run Open Source Local AI Models in Excel with Ollama☆23Updated 5 months ago
- Implementation for the paper "ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems".☆197Updated 2 weeks ago
- [EMNLP 2025 Demo] PresentAgent: Multimodal Agent for Presentation Video Generation☆121Updated last month
- A handy tool for prompt engineers: compare the results of multiple prompts across multiple LLM models at the same time☆96Updated 2 years ago
- Tencent Hunyuan 7B (Hunyuan-7B for short) is one of Tencent Hunyuan's dense large language models☆70Updated 4 months ago
- Bambo is a new proxy framework. Compared with mainstream frameworks, it is more lightweight and flexible and can handle various load task…☆33Updated 11 months ago
- Qwen-TTS offers a robust voice synthesis service using FastAPI, supporting bilingual and dialect options. Explore seamless audio generati…☆108Updated this week
- XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc.☆38Updated last year
- Use Qwen to create prompts for SDXL☆34Updated 2 years ago
- ☆45Updated 4 months ago
- A minimalistic, hackable code base to finetune Wan video generation model☆49Updated 8 months ago
- ☆26Updated last year
- A streamlined implementation of Grounding DINO and SAM for advanced image segmentation. This lightweight solution simplifies the integrat…☆66Updated last year
- HunyuanVideo: A Systematic Framework For Large Video Generation Model☆48Updated last year
- ☆25Updated last year
- Enable tool-use ability for any LLM model (DeepSeek V3/R1, etc.)☆58Updated 7 months ago
- ComfyUI wrapper for Moondream's gaze detection☆55Updated 11 months ago
- Mission intent compiler and autonomy supervisor for unmanned systems.☆144Updated 3 weeks ago
- To try TextIn document parsing, visit https://cc.co/16YSIy ☆22Updated last year
- Here we will track the latest AI Multimodal Models, including Multimodal Foundation Models, LLM, Agent, Audio, Image, Video, Music and 3D…☆37Updated 11 months ago
- Incredibly descriptive audiovisual summaries for videos☆41Updated last year
- ☆29Updated 2 years ago
- 🔥Your Daily Dose of AI Research from Hugging Face 🔥 Stay updated with the latest AI breakthroughs! This bot automatically collects and…☆56Updated this week
- Get up and running with Llama 3, Mistral, Gemma, and other large language models.☆30Updated last month
- ☆16Updated 5 months ago
- ☆71Updated last month
- ☆18Updated 8 months ago
- EfficientSAM + YOLO World base model for use with Autodistill.☆10Updated last year