intentee / paddler
Open-source LLM load balancer and serving platform for self-hosting LLMs at scale 🏓🦀
☆1,430 · Updated 3 weeks ago
Alternatives and similar repositories for paddler
Users interested in paddler are comparing it to the libraries listed below.
- Minimal LLM inference in Rust ☆1,029 · Updated last year
- A cross-platform browser ML framework. ☆744 · Updated last year
- Like grep but for natural language questions. Based on Mistral 7B or Mixtral 8x7B. ☆385 · Updated last year
- A high-performance inference engine for AI models ☆1,419 · Updated this week
- Efficient platform for inference and serving local LLMs, including an OpenAI-compatible API server. ☆587 · Updated last week
- WebAssembly binding for llama.cpp - Enabling on-browser LLM inference ☆987 · Updated last month
- VS Code extension for LLM-assisted code/text completion ☆1,139 · Updated 2 weeks ago
- SeekStorm - sub-millisecond full-text search library & multi-tenancy server in Rust ☆1,823 · Updated this week
- ♾️ Helix is a private GenAI stack for building AI agents with declarative pipelines, knowledge (RAG), API bindings, and first-class testi… ☆712 · Updated this week
- A realtime serving engine for Data-Intensive Generative AI Applications ☆1,094 · Updated last week
- Large-scale LLM inference engine ☆1,641 · Updated 2 weeks ago
- Reliable model swapping for any local OpenAI/Anthropic compatible server - llama.cpp, vllm, etc. ☆2,311 · Updated this week
- Super-fast Structured Outputs ☆670 · Updated last week
- ☆453 · Updated this week
- Fast, flexible LLM inference ☆6,449 · Updated last week
- Rust library for vector embeddings and reranking. ☆748 · Updated 2 weeks ago
- The Easiest Rust Interface for Local LLMs and an Interface for Deterministic Signals from Probabilistic LLM Vibes ☆242 · Updated 6 months ago
- Embeddable library or single binary for indexing and searching 1B vectors ☆366 · Updated last month
- Split text into semantic chunks, up to a desired chunk size. Supports calculating length by characters and tokens, and is callable from R… ☆551 · Updated this week
- Felafax is building AI infra for non-NVIDIA GPUs ☆570 · Updated last year
- A pure Rust LLM, VLM, VLA, TTS, OCR inference engine, powered by Candle & Rust. An alternative to llama.cpp, but much simpler … ☆241 · Updated last week
- Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference. ☆2,815 · Updated 2 weeks ago
- Fully neural approach for text chunking ☆406 · Updated 3 months ago
- Lightning-fast data version control system for structured and unstructured machine learning datasets. We aim to make versioning datasets … ☆1,084 · Updated this week
- llama.cpp fork with additional SOTA quants and improved performance ☆1,587 · Updated this week
- Scalable, fast, and disk-friendly vector search in Postgres, the successor of pgvecto.rs. ☆1,527 · Updated last week
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs), allowing users to chat with LLM … ☆615 · Updated 11 months ago
- Fast, streaming indexing, query, and agentic LLM applications in Rust ☆653 · Updated last week
- Big & Small LLMs working together ☆1,258 · Updated this week
- Things you can do with the token embeddings of an LLM ☆1,452 · Updated 2 months ago