intentee / paddler
Open-source LLM load balancer and serving platform for self-hosting LLMs at scale 🏓🦀
★1,379 · Updated last week
Alternatives and similar repositories for paddler
Users interested in paddler are comparing it to the libraries listed below.
- Minimal LLM inference in Rust (★1,025, updated last year)
- A high-performance inference engine for AI models (★1,383, updated this week)
- Like grep but for natural language questions. Based on Mistral 7B or Mixtral 8x7B. (★386, updated last year)
- A cross-platform browser ML framework. (★725, updated last year)
- VS Code extension for LLM-assisted code/text completion (★1,072, updated 2 weeks ago)
- WebAssembly binding for llama.cpp - Enabling on-browser LLM inference (★942, updated last week)
- Efficient platform for inference and serving local LLMs, including an OpenAI-compatible API server. (★540, updated this week)
- A realtime serving engine for Data-Intensive Generative AI Applications (★1,074, updated last week)
- SeekStorm - sub-millisecond full-text search library & multi-tenancy server in Rust (★1,782, updated this week)
- A multi-platform desktop application to evaluate and compare LLM models, written in Rust and React. (★878, updated 2 weeks ago)
- Big & Small LLMs working together (★1,211, updated this week)
- The Easiest Rust Interface for Local LLMs and an Interface for Deterministic Signals from Probabilistic LLM Vibes (★239, updated 4 months ago)
- Reliable model swapping for any local OpenAI/Anthropic-compatible server - llama.cpp, vllm, etc. (★1,977, updated this week)
- ♾️ Helix is a private GenAI stack for building AI agents with declarative pipelines, knowledge (RAG), API bindings, and first-class testi… (★545, updated last week)
- Scalable, fast, and disk-friendly vector search in Postgres, the successor of pgvecto.rs. (★1,368, updated last week)
- Blazingly fast LLM inference. (★6,253, updated last week)
- (★407, updated this week)
- Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference. (★2,755, updated last month)
- Rust library for generating vector embeddings and reranking. A rewrite of qdrant/fastembed. (★678, updated last week)
- Fast, streaming indexing, query, and agentic LLM applications in Rust (★618, updated last week)
- Highly Performant, Modular, Memory Safe and Production-ready Inference, Ingestion and Indexing built in Rust 🦀 (★856, updated 3 weeks ago)
- Korvus is a search SDK that unifies the entire RAG pipeline in a single database query. Built on top of Postgres with bindings for Python… (★1,453, updated 10 months ago)
- Super-fast Structured Outputs (★621, updated last week)
- Embeddable library or single binary for indexing and searching 1B vectors (★337, updated this week)
- Large-scale LLM inference engine