vikhyat / moondream
tiny vision language model
☆8,019Updated last week
Alternatives and similar repositories for moondream
Users who are interested in moondream are comparing it to the libraries listed below
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.☆12,174Updated this week
- Zero-Shot Speech Editing and Text-to-Speech in the Wild☆8,270Updated 2 months ago
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.☆22,643Updated 9 months ago
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi…☆8,319Updated last week
- Official inference repo for FLUX.1 models☆21,848Updated 3 months ago
- Welcome to the Llama Cookbook! This is your go-to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als…☆17,395Updated this week
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor…☆22,042Updated 2 months ago
- Everything about the SmolLM2 and SmolVLM family of models☆2,442Updated 2 months ago
- a state-of-the-art-level open visual language model | multimodal pre-trained model☆6,562Updated last year
- Inference and training library for high-quality TTS models.☆5,261Updated 5 months ago
- llama3 implementation one matrix multiplication at a time☆14,982Updated last year
- Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥☆39,558Updated this week
- Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI☆22,098Updated this week
- Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.☆10,709Updated 2 weeks ago
- Python scraper based on AI☆19,817Updated this week
- A fast multimodal LLM for real-time voice☆3,968Updated 3 months ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models☆5,750Updated 9 months ago
- Distribute and run LLMs with a single file.☆22,510Updated 2 weeks ago
- Large World Model -- Modeling Text and Video with Millions Context☆7,277Updated 7 months ago
- Large Action Model framework to develop AI Web Agents☆6,065Updated 4 months ago
- LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a ch…☆5,917Updated last month
- streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL☆2,565Updated this week
- Run PyTorch LLMs locally on servers, desktop and mobile☆3,586Updated last week
- Letta (formerly MemGPT) is the stateful agents framework with memory, reasoning, and context management.☆16,619Updated last week
- ☆2,952Updated 8 months ago
- Retrieval Augmented Generation (RAG) chatbot powered by Weaviate☆7,133Updated 2 months ago
- Blazingly fast LLM inference.☆5,644Updated this week
- Open Source framework for voice and multimodal conversational AI☆6,247Updated this week
- OCR, layout analysis, reading order, table recognition in 90+ languages☆17,508Updated this week
- Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sag…☆23,409Updated this week