Open-source LLM/VLM load balancer and serving platform for self-hosting LLMs (and VLMs) at scale. Alternative to projects like llm-d, Docker Model Runner, etc., but with fewer moving parts and simple deployments built around the ggml ecosystem. Runs on CPU and GPU.
★1,607 · Jun 16, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for paddler
Users that are interested in paddler are comparing it to the libraries listed below.
- Extracts structured data from unstructured input. Programming language agnostic. Uses llama.cpp ★45 · May 16, 2024 · Updated 2 years ago
- Fast, flexible LLM inference ★7,362 · Updated this week
- Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference. ★2,964 · Apr 14, 2026 · Updated 2 months ago
- Reliable model swapping for any local OpenAI/Anthropic compatible server - llama.cpp, vllm, etc ★4,814 · Updated this week
- A fast inference library for running LLMs locally on modern consumer-class GPUs ★4,567 · Mar 4, 2026 · Updated 3 months ago
- Distribute and run LLMs with a single file. ★25,105 · Updated this week
- llama.cpp fork with additional SOTA quants and improved performance ★2,804 · Updated this week
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). Allowing users to chat with LLM … ★647 · Mar 9, 2026 · Updated 3 months ago
- Large-scale LLM inference engine ★1,771 · May 8, 2026 · Updated last month
- Minimalist ML framework for Rust ★20,562 · Updated this week
- Go ahead and axolotl questions ★12,082 · Updated this week
- Plano is an AI-native proxy and data plane for agentic apps, with built-in orchestration, safety, observability, and smart LLM routing s… ★6,604 · Updated this week
- A vector search SQLite extension that runs anywhere! ★7,790 · May 18, 2026 · Updated last month
- [Unmaintained, see README] An ecosystem of Rust libraries for working with large language models ★6,153 · Jun 24, 2024 · Updated 2 years ago
- 💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows ★12,683 · Jun 22, 2026 · Updated last week
- Structured Outputs ★14,273 · Updated this week
- ★136 · May 26, 2026 · Updated last month
- Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference in pure C/C++ ★6,398 · Updated this week
- ★596 · Jun 24, 2026 · Updated last week
- Local AI API Platform ★2,755 · Jul 4, 2025 · Updated 11 months ago
- An async actor framework for Rust ★65 · Apr 23, 2026 · Updated 2 months ago
- LLM inference in C/C++ ★118,422 · Updated this week
- Minimal LLM inference in Rust ★1,036 · Oct 24, 2024 · Updated last year
- Burn is a next generation tensor library and Deep Learning Framework that doesn't compromise on flexibility, efficiency and portability. ★15,487 · Updated this week
- Official Rust Implementation of Model2Vec ★194 · May 24, 2026 · Updated last month
- Tensor library for machine learning ★14,871 · Jun 19, 2026 · Updated last week
- The official API server for Exllama. OAI compatible, lightweight, and fast. ★1,261 · Updated this week
- Chat language model that can use tools and interpret the results ★1,596 · Dec 3, 2025 · Updated 6 months ago
- Tools for merging pretrained large language models. ★7,190 · Jun 17, 2026 · Updated last week
- Python bindings for llama.cpp ★10,446 · Updated this week
- Open source platform for AI Engineering: OpenTelemetry-native LLM Observability, GPU Monitoring, Guardrails, Evaluations, Prompt Manageme… ★2,553 · Updated this week
- ★15 · Apr 26, 2025 · Updated last year
- Optimizing inference proxy for LLMs ★4,167 · May 7, 2026 · Updated last month
- High-level, optionally asynchronous Rust bindings to llama.cpp ★246 · Jun 5, 2024 · Updated 2 years ago
- Cleanai (https://github.com/willmil11/cleanai) except I'm making it in c now. Fast and clean from the start this time :) ★15 · Jun 16, 2026 · Updated 2 weeks ago
- WebAssembly binding for llama.cpp - Enabling on-browser LLM inference ★1,126 · Jun 17, 2026 · Updated last week
- Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cl… ★32,575 · Jun 23, 2026 · Updated last week
- Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data ve… ★6,727 · Updated this week
- Any model. Any hardware. Zero compromise. Built with @ziglang / @openxla / MLIR / @bazelbuild ★3,690 · Updated this week