vikhyat / moondream
tiny vision language model
☆8,560 Updated last week
Alternatives and similar repositories for moondream
Users who are interested in moondream are comparing it to the libraries listed below.
- streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL ☆2,633 Updated this week
- Run PyTorch LLMs locally on servers, desktop and mobile ☆3,614 Updated 3 weeks ago
- Your image is almost there! ☆7,658 Updated last year
- Structured Outputs ☆12,648 Updated this week
- Zero-Shot Speech Editing and Text-to-Speech in the Wild ☆8,398 Updated 6 months ago
- Large Action Model framework to develop AI Web Agents ☆6,179 Updated 8 months ago
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆12,817 Updated this week
- ☆3,028 Updated last year
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆8,763 Updated last year
- Go ahead and axolotl questions ☆10,496 Updated last week
- The #1 open-source voice interface for desktop, mobile, and ESP32 chips. ☆5,092 Updated 11 months ago
- High-speed Large Language Model Serving for Local Deployment ☆8,347 Updated 2 months ago
- A fast multimodal LLM for real-time voice ☆4,204 Updated last month
- 4M: Massively Multimodal Masked Modeling ☆1,765 Updated 4 months ago
- Blazingly fast LLM inference. ☆6,124 Updated 2 weeks ago
- Local AI API Platform ☆2,763 Updated 3 months ago
- Clarity AI | AI Image Upscaler & Enhancer - free and open-source Magnific Alternative ☆4,779 Updated 6 months ago
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆18,623 Updated last week
- A vector search SQLite extension that runs anywhere! ☆6,210 Updated 8 months ago
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ☆8,960 Updated this week
- Everything about the SmolLM and SmolVLM family of models ☆3,286 Updated 2 weeks ago
- Inference and training library for high-quality TTS models. ☆5,426 Updated 9 months ago
- CoreNet: A library for training deep neural networks ☆7,021 Updated last month
- Large World Model -- Modeling Text and Video with Millions Context ☆7,348 Updated 11 months ago
- AI Browser ☆5,579 Updated 2 weeks ago
- Foundational model for human-like, expressive TTS ☆4,167 Updated last year
- Fast and accurate automatic speech recognition (ASR) for edge devices ☆2,897 Updated last month
- LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a ch… ☆5,952 Updated 5 months ago
- Yes, it's another chat over documents implementation... but this one is entirely local! ☆1,795 Updated 6 months ago
- Accepted as [NeurIPS 2024] Spotlight Presentation Paper ☆6,350 Updated last year