evilsocket / cake
Distributed LLM and StableDiffusion inference for mobile, desktop and server.
☆2,854 · Updated 7 months ago
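Cake's pitch is that one large model can be sharded layer by layer across whatever heterogeneous hardware is available (phones, laptops, GPU servers), with each node serving a contiguous slice of the transformer and forwarding activations to the next. As a rough illustration only, here is a hypothetical topology sketch in that spirit; the file name, field names, hosts, and layer ranges are assumptions made for this example, not the project's documented schema, so check the cake README for the real format.

```yaml
# topology.yml — hypothetical sketch of a cake-style layer-sharding topology.
# All names and values below are illustrative assumptions, not the exact schema.
gpu_server:
  host: '192.168.1.10:10128'        # address this worker listens on
  description: 'Linux desktop with NVIDIA GPU'
  layers:
    - 'model.layers.0-15'           # first 16 transformer layers live here

macbook:
  host: '192.168.1.11:10128'
  description: 'Apple Silicon laptop'
  layers:
    - 'model.layers.16-27'          # middle of the stack

iphone:
  host: '192.168.1.12:10128'
  description: 'iPhone (Metal)'
  layers:
    - 'model.layers.28-31'          # the phone only holds the last few layers
```

In a setup like this, each device runs a worker process bound to its host address, and a coordinating node reads the topology and streams activations from shard to shard, so no single device ever has to fit the whole model in memory.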
Alternatives and similar repositories for cake
Users interested in cake are comparing it to the libraries listed below.
- Blazingly fast LLM inference. ☆5,644 · Updated this week
- Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference. ☆2,074 · Updated last month
- Local AI API Platform ☆2,715 · Updated 2 weeks ago
- Manage GPU clusters for running AI models ☆2,778 · Updated this week
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec. ☆8,356 · Updated this week
- Fast and accurate automatic speech recognition (ASR) for edge devices ☆2,718 · Updated 3 weeks ago
- SCUDA is a GPU over IP bridge allowing GPUs on remote machines to be attached to CPU-only machines. ☆1,724 · Updated last month
- tiny vision language model ☆8,019 · Updated last week
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs. ☆6,427 · Updated this week
- The Python library for real-time communication ☆3,991 · Updated this week
- A fast multimodal LLM for real-time voice ☆3,968 · Updated 3 months ago
- Open Source framework for voice and multimodal conversational AI ☆6,247 · Updated this week
- LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a chain of LLMs to find the answer. ☆5,917 · Updated last month
- Run LLaMA and other large language models offline on iOS and macOS using the GGML library. ☆1,777 · Updated 2 months ago
- prime is a framework for efficient, globally distributed training of AI models over the internet. ☆757 · Updated last week
- A Datacenter Scale Distributed Inference Serving Framework ☆4,136 · Updated this week
- An all-in-one LLM chat UI for Apple Silicon Macs using the MLX framework. ☆1,562 · Updated 8 months ago
- Low-bit LLM inference on CPU with lookup table ☆793 · Updated this week
- Local realtime voice AI ☆2,317 · Updated 3 months ago
- LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance. ☆3,262 · Updated this week
- FlashInfer: Kernel Library for LLM Serving ☆3,088 · Updated this week
- SGLang is a fast serving framework for large language models and vision language models. ☆14,814 · Updated this week
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the level of GPT-4o. ☆2,928 · Updated 2 weeks ago
- Tensor library for machine learning ☆12,614 · Updated this week
- Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference? ☆1,633 · Updated last year
- Tools for merging pretrained large language models. ☆5,754 · Updated 2 weeks ago
- Run PyTorch LLMs locally on servers, desktop and mobile ☆3,586 · Updated 2 weeks ago
- The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge ☆1,402 · Updated this week
- Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc. ☆9,200 · Updated this week
- High-speed Large Language Model Serving for Local Deployment ☆8,213 · Updated 3 months ago