evilsocket / cake
Distributed LLM and StableDiffusion inference for mobile, desktop and server.
☆2,838 Updated 5 months ago
Alternatives and similar repositories for cake:
Users who are interested in cake are comparing it to the libraries listed below.
- Blazingly fast LLM inference. ☆5,437 Updated this week
- Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference. ☆2,028 Updated this week
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ☆8,079 Updated this week
- A vector search SQLite extension that runs anywhere! ☆5,463 Updated 2 months ago
- A fast multimodal LLM for real-time voice ☆3,844 Updated 2 months ago
- lightweight, standalone C++ inference engine for Google's Gemma models. ☆6,338 Updated this week
- 🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT) ☆6,310 Updated 3 months ago
- open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming… ☆3,279 Updated 5 months ago
- SCUDA is a GPU over IP bridge allowing GPUs on remote machines to be attached to CPU-only machines. ☆1,697 Updated last week
- Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models ☆2,721 Updated 3 months ago
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs. ☆6,114 Updated this week
- Run PyTorch LLMs locally on servers, desktop and mobile ☆3,567 Updated this week
- Automate browser-based workflows with LLMs and Computer Vision ☆13,016 Updated this week
- first base model for full-duplex conversational audio ☆1,731 Updated 3 months ago
- tiny vision language model ☆7,796 Updated last week
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ☆2,889 Updated 5 months ago
- A self-organizing file system with llama 3 ☆5,255 Updated 2 months ago
- g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains ☆4,208 Updated 2 months ago
- Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚ ☆27,671 Updated last month
- SGLang is a fast serving framework for large language models and vision language models. ☆13,368 Updated this week
- The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge ☆1,347 Updated last week
- Open Source framework for voice and multimodal conversational AI ☆5,624 Updated this week
- 🔍 AI search engine - self-host with local or cloud LLMs ☆3,272 Updated 6 months ago
- ☆4,193 Updated last month
- LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a ch… ☆5,905 Updated 7 months ago
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model ☆7,461 Updated 2 months ago
- Fast and accurate automatic speech recognition (ASR) for edge devices ☆2,680 Updated last month
- Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer… ☆1,951 Updated this week
- A Datacenter Scale Distributed Inference Serving Framework ☆3,764 Updated this week
- pingcap/autoflow is a Graph RAG based and conversational knowledge base tool built with TiDB Serverless Vector Storage. Demo: https://tid… ☆2,512 Updated this week