Sanster / VLM-demos
A collection of VLM models that can be tried online.
☆14Updated last year
Alternatives and similar repositories for VLM-demos
Users who are interested in VLM-demos are comparing it to the repositories listed below.
- Auto Thinking Mode switch for Qwen3 in Open webui☆68Updated 4 months ago
- ☆46Updated last year
- Qwen-TTS offers a robust voice synthesis service using FastAPI, supporting bilingual and dialect options. Explore seamless audio generati…☆65Updated this week
- Run Open Source Local AI Models in Excel with Ollama☆22Updated last month
- Implementation for the paper "ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems".☆185Updated 7 months ago
- [EMNLP 2025 Demo] PresentAgent: Multimodal Agent for Presentation Video Generation☆100Updated 3 weeks ago
- Video generation via code☆56Updated this week
- To try TextIn document parsing, click https://cc.co/16YSIy☆21Updated last year
- Incredibly descriptive audiovisual summaries for videos☆41Updated last year
- Enable tool-use ability for any LLM model (DeepSeek V3/R1, etc.)☆56Updated 4 months ago
- An AI agent to control drones from your CLI☆133Updated last month
- Official repo for the paper “StyleTailor: Towards Personalized Fashion Styling via Hierarchical Negative Feedback”☆19Updated last month
- Uses Qwen to create prompts for SDXL☆34Updated last year
- XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc.☆38Updated last year
- ☆25Updated last year
- Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.☆13Updated last year
- ☆42Updated last month
- EfficientSAM + YOLO World base model for use with Autodistill.☆10Updated last year
- Get up and running with Llama 3, Mistral, Gemma, and other large language models.☆31Updated 2 weeks ago
- Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models☆15Updated last year
- A lightweight script for converting HTML pages to Markdown, with support for code blocks☆80Updated last year
- ComfyUI wrapper for Moondream's gaze detection☆55Updated 8 months ago
- 🎧 Pod-Helper: Real-time audio transcription and repair on consumer hardware☆76Updated last year
- ☆149Updated 2 weeks ago