tiny vision language model
☆9,554 · Nov 14, 2025 · Updated 4 months ago
Alternatives and similar repositories for moondream
Users who are interested in moondream are comparing it to the libraries listed below.
- A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone ☆24,322 · Apr 1, 2026 · Updated last week
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ☆24,652 · Aug 12, 2024 · Updated last year
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆19,557 · Apr 3, 2026 · Updated last week
- Unsloth Studio is a web UI for training and running open models like Qwen3.5, Gemma 4, DeepSeek, gpt-oss locally. ☆59,774 · Updated this week
- We write your reusable computer vision tools. 💜 ☆37,949 · Updated this week
- A natural language interface for computers ☆63,040 · Feb 9, 2026 · Updated 2 months ago
- Distribute and run LLMs with a single file. ☆24,121 · Updated this week
- Instant voice cloning by MIT and MyShell. Audio foundation model. ☆36,216 · Apr 19, 2025 · Updated 11 months ago
- Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time. ☆21,988 · Updated this week
- Build, run, manage agentic software at scale. ☆39,343 · Updated this week
- DSPy: The framework for programming—not prompting—language models ☆33,495 · Apr 2, 2026 · Updated last week
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ☆42,652 · Updated this week
- A fast multimodal LLM for real-time voice ☆4,396 · Dec 12, 2025 · Updated 4 months ago
- Zero-Shot Speech Editing and Text-to-Speech in the Wild ☆8,473 · Mar 15, 2025 · Updated last year
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆13,280 · Apr 4, 2026 · Updated last week
- The #1 open-source voice interface for desktop, mobile, and ESP32 chips. ☆5,113 · Nov 1, 2024 · Updated last year
- LLM inference in C/C++ ☆103,237 · Updated this week
- Large Action Model framework to develop AI Web Agents ☆6,311 · Jan 21, 2025 · Updated last year
- Go ahead and axolotl questions ☆11,608 · Updated this week
- 🙌 OpenHands: AI-Driven Development ☆70,666 · Updated this week
- Foundational model for human-like, expressive TTS ☆4,198 · Jul 30, 2024 · Updated last year
- Universal memory layer for AI Agents ☆52,137 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆75,637 · Updated this week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. ☆28,073 · Sep 30, 2025 · Updated 6 months ago
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆8,933 · May 3, 2024 · Updated last year
- High-performance In-browser LLM Inference Engine ☆17,740 · Updated this week
- 🔊 Text-Prompted Generative Audio Model ☆39,076 · Aug 19, 2024 · Updated last year
- Automate browser based workflows with AI ☆21,068 · Updated this week
- Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work t… ☆48,311 · Updated this week
- Structured Outputs ☆13,631 · Mar 26, 2026 · Updated 2 weeks ago
- streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL ☆2,667 · Updated this week
- Run frontier AI locally. ☆43,503 · Updated this week
- SOTA Open Source TTS ☆29,257 · Updated this week
- Inference and training library for high-quality TTS models. ☆5,560 · Dec 10, 2024 · Updated last year
- StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation ☆10,684 · Dec 4, 2024 · Updated last year
- Large World Model -- Modeling Text and Video with Millions Context ☆7,408 · Oct 19, 2024 · Updated last year
- a state-of-the-art-level open visual language model | 多模态预训练模型 ☆6,736 · May 29, 2024 · Updated last year
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production ☆44,993 · Aug 16, 2024 · Updated last year
- An open-source RAG-based tool for chatting with your documents. ☆25,260 · Apr 3, 2026 · Updated last week