distantmagic / paddler
Stateful load balancer custom-tailored for llama.cpp 🏓🦙
★728, updated this week
Alternatives and similar repositories for paddler:
Users who are interested in paddler compare it to the repositories listed below:
- Like grep but for natural language questions. Based on Mistral 7B or Mixtral 8x7B. (★380, updated last year)
- WebAssembly binding for llama.cpp - Enabling on-browser LLM inference (★620, updated last week)
- transparent proxy server for llama.cpp's server to provide automatic model swapping (★460, updated this week)
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses. (★566, updated 4 months ago)
- Felafax is building AI infra for non-NVIDIA GPUs (★555, updated last month)
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). Allowing users to chat with LLM … (★545, updated last month)
- GGUF implementation in C as a library and a tools CLI program (★261, updated 2 months ago)
- Scalable, fast, and disk-friendly vector search in Postgres, the successor of pgvecto.rs. (★526, updated this week)
- Replace OpenAI with Llama.cpp Automagically. (★311, updated 9 months ago)
- Things you can do with the token embeddings of an LLM (★1,432, updated last month)
- An implementation of bucketMul LLM inference (★215, updated 8 months ago)
- A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for vario… (★1,009, updated 3 weeks ago)
- Finetune llama2-70b and codellama on MacBook Air without quantization (★448, updated 11 months ago)
- A cross-platform browser ML framework. (★669, updated 3 months ago)
- 🧬 Helix is a private GenAI stack for building AI applications with declarative pipelines, knowledge (RAG), API bindings, and first-class… (★472, updated this week)
- A fast batching API to serve LLM models (★182, updated 10 months ago)
- The simplest way to build AI workloads on Postgres (★765, updated this week)
- Minimal LLM inference in Rust (★980, updated 4 months ago)
- VS Code extension for LLM-assisted code/text completion (★608, updated this week)
- LLM Analytics (★646, updated 5 months ago)
- A realtime serving engine for Data-Intensive Generative AI Applications (★980, updated this week)
- LLM-powered lossless compression tool (★274, updated 7 months ago)
- (★163, updated 9 months ago)
- Korvus is a search SDK that unifies the entire RAG pipeline in a single database query. Built on top of Postgres with bindings for Python… (★1,343, updated last month)
- A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and full… (★601, updated 3 months ago)
- LLM-based code completion engine (★181, updated last month)
- Dynamically structure language models to produce outputs that adhere to specific requirements without sacrificing their creative capabili… (★118, updated last week)
- Fast, SQL powered, in-process vector search for any language with an SQLite driver (★292, updated 4 months ago)