evilsocket / cakeLinks

Distributed LLM and StableDiffusion inference for mobile, desktop and server.

☆2,872

Alternatives and similar repositories for cake

Users that are interested in cake are comparing it to the libraries listed below

Sorting:

EricLBuehler / mistral.rs
Blazingly fast LLM inference.
☆5,849Updated this week
LlamaEdge / LlamaEdge
The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge
☆1,442Updated last week
b4rtaz / distributed-llama
Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference.
☆2,209Updated last week
kevmo314 / scuda
SCUDA is a GPU over IP bridge allowing GPUs on remote machines to be attached to CPU-only machines.
☆1,740Updated 3 weeks ago
vikhyat / moondream
tiny vision language model
☆8,179Updated 3 weeks ago
asg017 / sqlite-vec
A vector search SQLite extension that runs anywhere!
☆5,858Updated 5 months ago
NousResearch / DisTrO
Distributed Training Over-The-Internet
☆945Updated 2 months ago
exo-explore / exo
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
☆28,953Updated 3 months ago
bklieger-groq / g1
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
☆4,227Updated 5 months ago
pytorch / torchchat
Run PyTorch LLMs locally on servers, desktop and mobile
☆3,597Updated this week
togethercomputer / MoA
Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models
☆2,774Updated 6 months ago
menloresearch / cortex.cpp
Local AI API Platform
☆2,765Updated last week
huggingface / text-embeddings-inference
A blazing fast inference solution for text embeddings models
☆3,781Updated this week
AnswerDotAI / gpu.cpp
A lightweight library for portable low-level GPU computation using WebGPU.
☆3,877Updated 4 months ago
jafioti / luminal
Deep learning at the speed of light.
☆1,836Updated this week
MLSysOps / MLE-agent
🤖 MLE-Agent: Your intelligent companion for seamless AI engineering and research. 🔍 Integrate with arxiv and paper with code to provide…
☆1,313Updated this week
Lizonghang / prima.cpp
prima.cpp: Speeding up 70B-scale LLM inference on low-resource everyday home clusters
☆975Updated this week
RahulSChand / gpu_poor
Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
☆1,324Updated 7 months ago
huggingface / ratchet
A cross-platform browser ML framework.
☆709Updated 7 months ago
menloresearch / ichigo
Local realtime voice AI
☆2,336Updated 4 months ago
srush / llama2.rs
A fast llama2 decoder in pure Rust.
☆1,051Updated last year
kyutai-labs / moshi
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi…
☆8,636Updated this week
agiresearch / AIOS
AIOS: AI Agent Operating System
☆4,358Updated this week
moonshine-ai / moonshine
Fast and accurate automatic speech recognition (ASR) for edge devices
☆2,793Updated 2 months ago
PrimeIntellect-ai / prime
prime is a framework for efficient, globally distributed training of AI models over the internet.
☆779Updated last month
filipstrand / mflux
A MLX port of FLUX based on the Huggingface Diffusers implementation.
☆1,472Updated last week
huggingface / candle
Minimalist ML framework for Rust
☆17,592Updated last week
nilsherzig / LLocalSearch
LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a ch…
☆5,934Updated 2 months ago
ridgerchu / matmulfreellm
Implementation for MatMul-free LM.
☆3,016Updated 8 months ago
microsoft / aici
AICI: Prompts as (Wasm) Programs
☆2,039Updated 5 months ago