vikhyat/moondream

tiny vision language model

☆9,554

Alternatives and similar repositories for moondream

Users who are interested in moondream are comparing it to the libraries listed below.

OpenBMB / MiniCPM-o
A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone
☆24,322 · Apr 1, 2026 · Updated last week
haotian-liu / LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
☆24,652 · Aug 12, 2024 · Updated last year
datalab-to / surya
OCR, layout analysis, reading order, table recognition in 90+ languages
☆19,557 · Apr 3, 2026 · Updated last week
unslothai / unsloth
Unsloth Studio is a web UI for training and running open models like Qwen3.5, Gemma 4, DeepSeek, gpt-oss locally.
☆59,774 · Updated this week
roboflow / supervision
We write your reusable computer vision tools. 💜
☆37,949 · Updated this week
openinterpreter / open-interpreter
A natural language interface for computers
☆63,040 · Feb 9, 2026 · Updated 2 months ago
mozilla-ai / llamafile
Distribute and run LLMs with a single file.
☆24,121 · Updated this week
myshell-ai / OpenVoice
Instant voice cloning by MIT and MyShell. Audio foundation model.
☆36,216 · Apr 19, 2025 · Updated 11 months ago
letta-ai / letta
Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time.
☆21,988 · Updated this week
agno-agi / agno
Build, run, manage agentic software at scale.
☆39,343 · Updated this week
stanfordnlp / dspy
DSPy: The framework for programming—not prompting—language models
☆33,495 · Apr 2, 2026 · Updated last week
BerriAI / litellm
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a…
☆42,652 · Updated this week
fixie-ai / ultravox
A fast multimodal LLM for real-time voice
☆4,396 · Dec 12, 2025 · Updated 4 months ago
jasonppy / VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
☆8,473 · Mar 15, 2025 · Updated last year
Lightning-AI / litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
☆13,280 · Apr 4, 2026 · Updated last week
openinterpreter / 01
The #1 open-source voice interface for desktop, mobile, and ESP32 chips.
☆5,113 · Nov 1, 2024 · Updated last year
ggml-org / llama.cpp
LLM inference in C/C++
☆103,237 · Updated this week
lavague-ai / LaVague
Large Action Model framework to develop AI Web Agents
☆6,311 · Jan 21, 2025 · Updated last year
axolotl-ai-cloud / axolotl
Go ahead and axolotl questions
☆11,608 · Updated this week
OpenHands / OpenHands
🙌 OpenHands: AI-Driven Development
☆70,666 · Updated this week
metavoiceio / metavoice-src
Foundational model for human-like, expressive TTS
☆4,198 · Jul 30, 2024 · Updated last year
mem0ai / mem0
Universal memory layer for AI Agents
☆52,137 · Updated this week
vllm-project / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆75,637 · Updated this week
stanford-oval / storm
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
☆28,073 · Sep 30, 2025 · Updated 6 months ago
jzhang38 / TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
☆8,933 · May 3, 2024 · Updated last year
mlc-ai / web-llm
High-performance In-browser LLM Inference Engine
☆17,740 · Updated this week
suno-ai / bark
🔊 Text-Prompted Generative Audio Model
☆39,076 · Aug 19, 2024 · Updated last year
Skyvern-AI / skyvern
Automate browser based workflows with AI
☆21,068 · Updated this week
crewAIInc / crewAI
Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work t…
☆48,311 · Updated this week
dottxt-ai / outlines
Structured Outputs
☆13,631 · Mar 26, 2026 · Updated 2 weeks ago
roboflow / maestro
streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL
☆2,667 · Updated this week
exo-explore / exo
Run frontier AI locally.
☆43,503 · Updated this week
fishaudio / fish-speech
SOTA Open Source TTS
☆29,257 · Updated this week
huggingface / parler-tts
Inference and training library for high-quality TTS models.
☆5,560 · Dec 10, 2024 · Updated last year
cumulo-autumn / StreamDiffusion
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
☆10,684 · Dec 4, 2024 · Updated last year
LargeWorldModel / LWM
Large World Model -- Modeling Text and Video with Millions Context
☆7,408 · Oct 19, 2024 · Updated last year
zai-org / CogVLM
a state-of-the-art-level open visual language model | multimodal pre-trained model
☆6,736 · May 29, 2024 · Updated last year
coqui-ai / TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
☆44,993 · Aug 16, 2024 · Updated last year
Cinnamon / kotaemon
An open-source RAG-based tool for chatting with your documents.
☆25,260 · Apr 3, 2026 · Updated last week