vikhyat / moondream
tiny vision language model
☆8,245 Updated last month
Alternatives and similar repositories for moondream
Users who are interested in moondream are comparing it to the libraries listed below.
- Large Action Model framework to develop AI Web Agents ☆6,101 Updated 6 months ago
- Run PyTorch LLMs locally on servers, desktop and mobile ☆3,599 Updated 3 weeks ago
- Zero-Shot Speech Editing and Text-to-Speech in the Wild ☆8,357 Updated 4 months ago
- Everything about the SmolLM and SmolVLM family of models ☆3,032 Updated this week
- ☆2,990 Updated 10 months ago
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ☆8,701 Updated this week
- Inference and training library for high-quality TTS models. ☆5,370 Updated 7 months ago
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆8,667 Updated last year
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆3,068 Updated last week
- A fast multimodal LLM for real-time voice ☆4,108 Updated 3 weeks ago
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆12,564 Updated last week
- Llama-3 agents that can browse the web by following instructions and talking to you ☆1,411 Updated 7 months ago
- streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL ☆2,604 Updated this week
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ☆23,180 Updated 11 months ago
- Foundational model for human-like, expressive TTS ☆4,142 Updated last year
- Local realtime voice AI ☆2,345 Updated 4 months ago
- Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models" ☆3,302 Updated last year
- Blazingly fast LLM inference. ☆5,923 Updated this week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆17,882 Updated this week
- A vector search SQLite extension that runs anywhere! ☆5,922 Updated 6 months ago
- Perplexica is an AI-powered search engine. It is an Open source alternative to Perplexity AI ☆23,322 Updated last week
- a state-of-the-art-level open visual language model | 多模态预训练模型 ☆6,626 Updated last year
- The easiest way to deploy agents, MCP servers, models, RAG, pipelines and more. No MLOps. No YAML. ☆3,427 Updated this week
- Tools for merging pretrained large language models. ☆6,122 Updated this week
- Modeling, training, eval, and inference code for OLMo ☆5,822 Updated last week
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ☆2,962 Updated 2 months ago
- Structured Outputs ☆12,188 Updated this week
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆3,587 Updated 2 months ago
- [EMNLP'23, ACL'24] To speed up LLM inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which ach… ☆5,300 Updated 4 months ago
- Ollama Python library ☆8,146 Updated last week