b4rtaz / distributed-llama
Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices mean faster inference.
☆2,743 · Updated 3 weeks ago
Alternatives and similar repositories for distributed-llama
Users interested in distributed-llama are comparing it to the libraries listed below.
- Large-scale LLM inference engine ☆1,596 · Updated this week
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆4,364 · Updated 3 months ago
- llama.cpp fork with additional SOTA quants and improved performance ☆1,329 · Updated this week
- Local AI API Platform ☆2,764 · Updated 4 months ago
- Open-source LLM load balancer and serving platform for self-hosting LLMs at scale 🏓🦙 ☆1,373 · Updated this week
- Reliable model swapping for any local OpenAI-compatible server (llama.cpp, vllm, etc.) ☆1,899 · Updated this week
- NVIDIA Linux open GPU with P2P support ☆1,278 · Updated 5 months ago
- An awesome repository of local AI tools ☆1,742 · Updated last year
- Lightweight inference library for ONNX files, written in C++. It can run Stable Diffusion XL 1.0 on a RPi Zero 2 (or in 298 MB of RAM) but… ☆2,004 · Updated this week
- Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU) ☆746 · Updated last week
- WebAssembly binding for llama.cpp, enabling on-browser LLM inference ☆941 · Updated last month
- Blazingly fast LLM inference. ☆6,242 · Updated this week
- Distributed LLM and Stable Diffusion inference for mobile, desktop, and server. ☆2,893 · Updated last year
- Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference? ☆1,832 · Updated last year
- Llama 2 Everywhere (L2E) ☆1,521 · Updated 2 months ago
- A more memory-efficient rewrite of the HF Transformers implementation of Llama for use with quantized weights. ☆2,905 · Updated 2 years ago
- VS Code extension for LLM-assisted code/text completion ☆1,062 · Updated this week
- Multi-LoRA inference server that scales to thousands of fine-tuned LLMs ☆3,533 · Updated 6 months ago
- Local real-time voice AI ☆2,378 · Updated 8 months ago
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs), allowing users to chat with LLM… ☆609 · Updated 9 months ago
- Implementation for MatMul-free LM. ☆3,037 · Updated 4 months ago
- ⚡ Build your chatbot within minutes on your favorite device; offers SOTA compression techniques for LLMs; runs LLMs efficiently on Intel Pl… ☆2,167 · Updated last year
- The official API server for Exllama. OpenAI-compatible, lightweight, and fast. ☆1,090 · Updated this week
- The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge ☆1,542 · Updated last month
- Effortlessly run LLM backends, APIs, frontends, and services with one command. ☆2,154 · Updated this week
- LocalAGI is a powerful, self-hostable AI agent platform designed for maximum privacy and flexibility. A complete drop-in replacement for… ☆1,368 · Updated this week
- Big & small LLMs working together ☆1,200 · Updated this week
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses. ☆621 · Updated last year
- The terminal client for Ollama ☆2,247 · Updated last month
- Distributed Training Over-the-Internet ☆967 · Updated last month