Open-source LLM/VLM load balancer and serving platform for self-hosting LLMs (and VLMs) at scale. An alternative to projects like llm-d and Docker Model Runner, but with fewer moving parts and simpler deployments, built around the ggml ecosystem. Runs on CPU and GPU.
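The core idea behind a slots-aware load balancer of this kind can be sketched as follows. This is a hedged illustration, not paddler's actual API: the `Upstream` fields and `pick_upstream` helper are hypothetical names, and the sketch only shows the least-busy selection strategy, assuming each inference instance reports how many request slots it has free.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Upstream:
    """One inference instance, e.g. a llama.cpp server (illustrative type)."""
    name: str
    slots_idle: int   # free request slots reported by the instance
    slots_total: int

def pick_upstream(upstreams: list[Upstream]) -> Optional[Upstream]:
    """Pick the upstream with the most idle slots (least-busy strategy)."""
    candidates = [u for u in upstreams if u.slots_idle > 0]
    if not candidates:
        return None  # fleet saturated; caller may queue or reject the request
    return max(candidates, key=lambda u: u.slots_idle)

fleet = [
    Upstream("gpu-0", slots_idle=1, slots_total=4),
    Upstream("gpu-1", slots_idle=3, slots_total=4),
    Upstream("cpu-0", slots_idle=0, slots_total=2),
]
print(pick_upstream(fleet).name)  # -> gpu-1 (most idle slots)
```

Balancing on idle slots rather than plain round-robin matters for LLM serving because requests have wildly different durations; an instance mid-way through a long generation should receive less new work.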
★ 1,540 · Apr 28, 2026 · Updated this week
Alternatives and similar repositories for paddler
Users that are interested in paddler are comparing it to the libraries listed below.
- Extracts structured data from unstructured input. Programming language agnostic. Uses llama.cpp ★ 45 · May 16, 2024 · Updated last year
- Fast, flexible LLM inference ★ 7,074 · Apr 15, 2026 · Updated 2 weeks ago
- Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference. ★ 2,908 · Apr 14, 2026 · Updated 2 weeks ago
- Reliable model swapping for any local OpenAI/Anthropic-compatible server - llama.cpp, vllm, etc. ★ 3,641 · Updated this week
- A fast inference library for running LLMs locally on modern consumer-class GPUs ★ 4,511 · Mar 4, 2026 · Updated last month
- llama.cpp fork with additional SOTA quants and improved performance ★ 2,194 · Apr 24, 2026 · Updated last week
- Distribute and run LLMs with a single file. ★ 24,274 · Apr 23, 2026 · Updated last week
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). Allowing users to chat with LLM… ★ 630 · Mar 9, 2026 · Updated last month
- Large-scale LLM inference engine ★ 1,714 · Updated this week
- Minimalist ML framework for Rust ★ 20,082 · Apr 23, 2026 · Updated last week
- Go ahead and axolotl questions ★ 11,779 · Updated this week
- Plano is an AI-native proxy and data plane for agentic apps, with built-in orchestration, safety, observability, and smart LLM routing s… ★ 6,390 · Updated this week
- A vector search SQLite extension that runs anywhere! ★ 7,483 · Apr 8, 2026 · Updated 3 weeks ago
- [Unmaintained, see README] An ecosystem of Rust libraries for working with large language models ★ 6,152 · Jun 24, 2024 · Updated last year
- 💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows ★ 12,424 · Apr 21, 2026 · Updated last week
- Structured Outputs ★ 13,741 · Apr 16, 2026 · Updated 2 weeks ago
- ★ 550 · Updated this week
- ★ 135 · Apr 8, 2026 · Updated 3 weeks ago
- Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference in pure C/C++ ★ 5,833 · Apr 23, 2026 · Updated last week
- Local AI API Platform ★ 2,761 · Jul 4, 2025 · Updated 9 months ago
- An async actor framework for Rust ★ 63 · Apr 23, 2026 · Updated last week
- LLM inference in C/C++ ★ 106,639 · Updated this week
- Burn is a next-generation tensor library and deep learning framework that doesn't compromise on flexibility, efficiency, and portability. ★ 14,938 · Updated this week
- Minimal LLM inference in Rust ★ 1,036 · Oct 24, 2024 · Updated last year
- The official API server for Exllama. OAI-compatible, lightweight, and fast. ★ 1,197 · Updated this week
- Chat language model that can use tools and interpret the results ★ 1,594 · Dec 3, 2025 · Updated 4 months ago
- Tools for merging pretrained large language models. ★ 7,023 · Mar 15, 2026 · Updated last month
- Tensor library for machine learning ★ 14,560 · Updated this week
- Optimizing inference proxy for LLMs ★ 3,440 · Mar 19, 2026 · Updated last month
- Open-source platform for AI Engineering: OpenTelemetry-native LLM Observability, GPU Monitoring, Guardrails, Evaluations, Prompt Manageme… ★ 2,394 · Apr 23, 2026 · Updated last week
- High-level, optionally asynchronous Rust bindings to llama.cpp ★ 246 · Jun 5, 2024 · Updated last year
- ★ 15 · Apr 26, 2025 · Updated last year
- Python bindings for llama.cpp ★ 10,240 · Updated this week
- Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cl… ★ 30,799 · Updated this week
- Cleanai (https://github.com/willmil11/cleanai) except I'm making it in C now. Fast and clean from the start this time :) ★ 16 · Mar 6, 2026 · Updated last month
- Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data ve… ★ 6,363 · Updated this week
- WebAssembly binding for llama.cpp - Enabling on-browser LLM inference ★ 1,048 · Updated this week
- Official Rust Implementation of Model2Vec ★ 176 · Apr 10, 2026 · Updated 3 weeks ago
- Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU) ★ 868 · Apr 3, 2026 · Updated 3 weeks ago