evilsocket / cake
Distributed LLM and StableDiffusion inference for mobile, desktop and server.
☆2,896 · Updated last year
Alternatives and similar repositories for cake
Users interested in cake are comparing it to the libraries listed below.
- Blazingly fast LLM inference. ☆6,262 · Updated this week
- Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices mean faster inference. ☆2,761 · Updated last week
- The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge. ☆1,550 · Updated 2 weeks ago
- SCUDA is a GPU-over-IP bridge that allows GPUs on remote machines to be attached to CPU-only machines. ☆1,785 · Updated 5 months ago
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ☆9,166 · Updated 3 weeks ago
- Run PyTorch LLMs locally on servers, desktops and mobile. ☆3,620 · Updated 3 months ago
- LLocalSearch is a completely locally running search aggregator using LLM agents. The user can ask a question and the system will use a ch… ☆5,956 · Updated 7 months ago
- CoreNet: A library for training deep neural networks. ☆7,025 · Updated 2 months ago
- Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚ ☆32,790 · Updated last month
- Fast and accurate automatic speech recognition (ASR) for edge devices. ☆3,024 · Updated 3 weeks ago
- A vector search SQLite extension that runs anywhere! ☆6,513 · Updated 10 months ago
- Local AI API platform. ☆2,764 · Updated 5 months ago
- g1: Using Llama-3.1 70B on Groq to create o1-like reasoning chains. ☆4,220 · Updated 3 months ago
- Local realtime voice AI. ☆2,386 · Updated 2 weeks ago
- GPU cluster manager for optimized AI model deployment. ☆4,200 · Updated this week
- Deep learning at the speed of light. ☆2,648 · Updated 3 weeks ago
- A fast llama2 decoder in pure Rust. ☆1,056 · Updated 2 years ago
- Together Mixture-of-Agents (MoA) – 65.1% on AlpacaEval with OSS models. ☆2,836 · Updated 11 months ago
- A lightweight library for portable low-level GPU computation using WebGPU. ☆3,923 · Updated 2 months ago
- Bionic is an on-premise replacement for ChatGPT, offering the advantages of generative AI while maintaining strict data confidentiality. ☆2,275 · Updated this week
- WebAssembly binding for llama.cpp, enabling in-browser LLM inference. ☆946 · Updated 2 weeks ago
- 🔍 An LLM-based multi-agent framework for a web search engine (like Perplexity.ai Pro and SearchGPT). ☆6,706 · Updated 5 months ago
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆3,153 · Updated last week
- LLaMA-Omni is a low-latency, high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ☆3,103 · Updated 6 months ago
- Speech To Speech: an effort toward an open-sourced and modular GPT-4o. ☆4,247 · Updated 7 months ago
- Run llama and other large language models offline on iOS and macOS using the GGML library. ☆1,921 · Updated 2 months ago
- Making the community's best AI chat models available to everyone. ☆1,985 · Updated 10 months ago
- A cross-platform browser ML framework. ☆731 · Updated last year
- Perplexity-inspired answer engine. ☆5,009 · Updated 5 months ago
- On-device Speech Recognition for Apple Silicon