intentee / paddler
Open-source LLM load balancer and serving platform for self-hosting LLMs at scale
☆1,414 · Updated last week
Alternatives and similar repositories for paddler
Users interested in paddler are comparing it to the libraries listed below.
- Minimal LLM inference in Rust ☆1,026 · Updated last year
- A high-performance inference engine for AI models ☆1,404 · Updated this week
- Like grep but for natural language questions. Based on Mistral 7B or Mixtral 8x7B. ☆385 · Updated last year
- A cross-platform browser ML framework. ☆739 · Updated last year
- WebAssembly binding for llama.cpp - Enabling on-browser LLM inference ☆970 · Updated 3 weeks ago
- VS Code extension for LLM-assisted code/text completion ☆1,124 · Updated last week
- A realtime serving engine for Data-Intensive Generative AI Applications ☆1,088 · Updated this week
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses. ☆624 · Updated last year
- Super-fast Structured Outputs ☆652 · Updated last month
- Efficient platform for inference and serving local LLMs, including an OpenAI-compatible API server. ☆566 · Updated last week
- ♾️ Helix is a private GenAI stack for building AI agents with declarative pipelines, knowledge (RAG), API bindings, and first-class testi… ☆703 · Updated this week
- Things you can do with the token embeddings of an LLM ☆1,449 · Updated last month
- Reliable model swapping for any local OpenAI/Anthropic compatible server - llama.cpp, vllm, etc. ☆2,176 · Updated this week
- llama.cpp fork with additional SOTA quants and improved performance ☆1,494 · Updated this week
- A multi-platform desktop application to evaluate and compare LLM models, written in Rust and React. ☆898 · Updated last month
- Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference. ☆2,786 · Updated last month
- Large-scale LLM inference engine ☆1,613 · Updated last week
- Replace OpenAI with Llama.cpp Automagically. ☆326 · Updated last year
- MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX. ☆1,987 · Updated last week
- Embeddable library or single binary for indexing and searching 1B vectors ☆346 · Updated last week
- Felafax is building AI infra for non-NVIDIA GPUs ☆570 · Updated 11 months ago
- ☆434 · Updated this week
- Fully neural approach for text chunking ☆404 · Updated 2 months ago
- SeekStorm - sub-millisecond full-text search library & multi-tenancy server in Rust ☆1,804 · Updated this week
- Korvus is a search SDK that unifies the entire RAG pipeline in a single database query. Built on top of Postgres with bindings for Python… ☆1,460 · Updated 11 months ago
- See Through Your Models ☆400 · Updated 6 months ago
- Scalable, fast, and disk-friendly vector search in Postgres, the successor of pgvecto.rs. ☆1,495 · Updated this week
- High-Performance Implementation of OpenAI's TikToken. ☆467 · Updated 6 months ago
- Split text into semantic chunks, up to a desired chunk size. Supports calculating length by characters and tokens, and is callable from R… ☆539 · Updated this week
- Faster structured generation ☆270 · Updated this week