intentee / paddler
Stateful load balancer custom-tailored for llama.cpp 🏓🦙
★798 · Updated last week
Alternatives and similar repositories for paddler
Users who are interested in paddler are comparing it to the libraries listed below.
- Like grep but for natural language questions. Based on Mistral 7B or Mixtral 8x7B. ★382 · Updated last year
- Minimal LLM inference in Rust ★1,010 · Updated 9 months ago
- Finetune llama2-70b and codellama on MacBook Air without quantization ★448 · Updated last year
- An implementation of bucketMul LLM inference ★221 · Updated last year
- Fully neural approach for text chunking ★367 · Updated 3 months ago
- Replace OpenAI with Llama.cpp Automagically. ★321 · Updated last year
- WebAssembly binding for llama.cpp - Enabling on-browser LLM inference ★781 · Updated last week
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). Allowing users to chat with LLM … ★579 · Updated 5 months ago
- Felafax is building AI infra for non-NVIDIA GPUs ★567 · Updated 6 months ago
- GGUF implementation in C as a library and a tools CLI program ★276 · Updated 6 months ago
- A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for vario… ★1,020 · Updated 5 months ago
- ♾️ Helix is a private GenAI stack for building AI agents with declarative pipelines, knowledge (RAG), API bindings, and first-class testi… ★508 · Updated this week
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit ★779 · Updated 11 months ago
- LLM-powered lossless compression tool ★285 · Updated 11 months ago
- Super-fast Structured Outputs ★342 · Updated last week
- A fast batching API to serve LLM models ★185 · Updated last year
- Fast, SQL powered, in-process vector search for any language with an SQLite driver ★319 · Updated 8 months ago
- Scalable, fast, and disk-friendly vector search in Postgres, the successor of pgvecto.rs. ★952 · Updated this week
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses. ★601 · Updated 9 months ago
- See Through Your Models ★398 · Updated 3 weeks ago
- VS Code extension for LLM-assisted code/text completion ★858 · Updated last week
- Things you can do with the token embeddings of an LLM ★1,445 · Updated 4 months ago
- A cross-platform browser ML framework. ★709 · Updated 8 months ago
- ★163 · Updated last year
- ★383 · Updated 2 weeks ago
- Large-scale LLM inference engine ★1,482 · Updated last week
- Model swapping for llama.cpp (or any local OpenAPI compatible server) ★1,088 · Updated this week
- Visualize the intermediate output of Mistral 7B ★367 · Updated 6 months ago
- Large Language Models (LLMs) applications and tools running on Apple Silicon in real-time with Apple MLX. ★451 · Updated 6 months ago
- High-Performance Implementation of OpenAI's TikToken. ★440 · Updated 3 weeks ago