distantmagic / paddler
Stateful load balancer custom-tailored for llama.cpp 🏓🦙
⭐666 Updated last week
Alternatives and similar repositories for paddler:
Users who are interested in paddler are comparing it to the repositories listed below.
- Like grep but for natural language questions. Based on Mistral 7B or Mixtral 8x7B. ⭐378 Updated 10 months ago
- An implementation of bucketMul LLM inference ⭐214 Updated 6 months ago
- Finetune llama2-70b and codellama on MacBook Air without quantization ⭐448 Updated 9 months ago
- Scalable, fast, and disk-friendly vector search in Postgres, the successor of pgvecto.rs. ⭐402 Updated last week
- Felafax is building AI infra for non-NVIDIA GPUs ⭐551 Updated this week
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit ⭐738 Updated 5 months ago
- The simplest way to build AI workloads on Postgres ⭐752 Updated last month
- This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient … ⭐216 Updated last month
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses. ⭐525 Updated 2 months ago
- Build, Improve Performance, and Productionize your LLM Application with an Integrated Framework ⭐336 Updated last month
- Things you can do with the token embeddings of an LLM ⭐1,411 Updated last week
- MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX. ⭐706 Updated this week
- Minimal LLM inference in Rust ⭐958 Updated 2 months ago
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). Allowing users to chat with LLM … ⭐519 Updated last month
- WebAssembly binding for llama.cpp - Enabling on-browser LLM inference ⭐485 Updated this week
- GGUF implementation in C as a library and a tools CLI program ⭐251 Updated last week
- ⭐163 Updated 7 months ago
- Action library for AI Agent ⭐206 Updated this week
- ⭐276 Updated 3 weeks ago
- ⭐736 Updated 9 months ago
- Stripped down, stable version of firecrawl optimized for self-hosting and ease of contribution. Billing logic and AI features are compl… ⭐295 Updated 2 weeks ago
- Radient turns many data types (not just text) into vectors for similarity search, RAG, regression analysis, and more. ⭐273 Updated 2 weeks ago
- Fast, SQL powered, in-process vector search for any language with an SQLite driver ⭐277 Updated 2 months ago
- Replace OpenAI with Llama.cpp Automagically. ⭐299 Updated 7 months ago
- Go library for embedded vector search and semantic embeddings using llama.cpp ⭐373 Updated 2 months ago
- FastMLX is a high performance production ready API to host MLX models. ⭐252 Updated last month
- Korvus is a search SDK that unifies the entire RAG pipeline in a single database query. Built on top of Postgres with bindings for Python… ⭐1,310 Updated 4 months ago
- ai for jq ⭐236 Updated 3 months ago
- ⭐664 Updated this week
- Lightweight Nearest Neighbors with Flexible Backends ⭐194 Updated last week