evilsocket / cake
Distributed LLM and StableDiffusion inference for mobile, desktop and server.
☆2,779 Updated 3 months ago
Alternatives and similar repositories for cake:
Users who are interested in cake compare it to the libraries listed below.
- Blazingly fast LLM inference. ☆5,064 Updated this week
- Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference. ☆1,851 Updated this week
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ☆7,506 Updated last week
- Run PyTorch LLMs locally on servers, desktop and mobile ☆3,506 Updated this week
- Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models ☆2,672 Updated last month
- Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks ☆6,185 Updated 3 months ago
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality ☆3,635 Updated 6 months ago
- Local realtime voice AI ☆2,224 Updated this week
- g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains ☆4,179 Updated 3 weeks ago
- Turn any webpage into structured data using LLMs ☆3,266 Updated 5 months ago
- Making the community's best AI chat models available to everyone. ☆1,920 Updated 2 weeks ago
- Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web, vision. ☆3,244 Updated this week
- A fast multimodal LLM for real-time voice ☆3,589 Updated last week
- SGLang is a fast serving framework for large language models and vision language models. ☆10,325 Updated this week
- Examples using MLX Swift ☆1,537 Updated last week
- On-device AI across mobile, embedded and edge for PyTorch ☆2,517 Updated this week
- Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension. ☆5,852 Updated 3 weeks ago
- A language model programming library. ☆5,614 Updated this week
- Lightning-fast serving engine for any AI model of any size. Flexible. Easy. Enterprise-scale. ☆2,843 Updated this week
- Fast and accurate automatic speech recognition (ASR) for edge devices ☆2,564 Updated 2 weeks ago
- A vector search SQLite extension that runs anywhere! ☆4,883 Updated 3 weeks ago
- Turn any glasses into AI-powered smart glasses ☆3,495 Updated 6 months ago
- Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚ ☆23,729 Updated this week
- tiny vision language model ☆7,409 Updated 2 weeks ago
- Agentic components of the Llama Stack APIs ☆4,140 Updated this week
- The Open Source Memory Layer For Autonomous Agents ☆2,000 Updated 3 months ago
- VS Code extension for LLM-assisted code/text completion ☆532 Updated this week
- 🔍 AI search engine - self-host with local or cloud LLMs ☆3,164 Updated 4 months ago
- Examples in the MLX framework ☆6,955 Updated this week
- Build a Perplexity-Inspired Answer Engine Using Next.js, Groq, Llama-3, Langchain, OpenAI, Upstash, Brave & Serper ☆4,837 Updated 4 months ago