dipampaul17 / KVSplit
Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss. Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.
☆353 Updated last month
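The 59% figure above is consistent with a back-of-the-envelope calculation. The sketch below assumes an FP16 baseline for both keys and values, and a block-quantized layout that stores one 16-bit scale per 32 elements (the layout of llama.cpp's `q8_0` and `q4_0` types). Whether KVSplit uses exactly this layout is an assumption here, and the helper names are illustrative only.

```python
# Rough estimate of KV cache savings for 8-bit keys and 4-bit values.
# Assumptions (not confirmed by the project description):
#   - baseline cache stores keys and values as 16-bit floats
#   - quantized formats use 32-element blocks with one 16-bit scale each

BLOCK_SIZE = 32   # elements per quantization block (assumed)
SCALE_BITS = 16   # per-block scale stored as fp16 (assumed)


def bits_per_element(payload_bits: int,
                     block_size: int = BLOCK_SIZE,
                     scale_bits: int = SCALE_BITS) -> float:
    """Effective bits per element, including the per-block scale overhead."""
    return payload_bits + scale_bits / block_size


def kv_reduction(key_bits: float, value_bits: float,
                 baseline_bits: float = 16.0) -> float:
    """Fraction of memory saved versus keys and values at baseline_bits."""
    return 1.0 - (key_bits + value_bits) / (2 * baseline_bits)


k8 = bits_per_element(8)          # 8.5 effective bits per key element
v4 = bits_per_element(4)          # 4.5 effective bits per value element
reduction = kv_reduction(k8, v4)  # 1 - 13/32

print(f"K8V4 saves about {reduction:.1%} versus FP16")
```

Without the per-block scale overhead the saving would be 1 - 12/32 = 62.5%, so under these assumptions the overhead accounts for the gap between the nominal figure and the reported 59%.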
Alternatives and similar repositories for KVSplit
Users who are interested in KVSplit are comparing it to the libraries listed below.
- Browser-LLM Auto-Scaling Technology ☆526 Updated this week
- Fully neural approach for text chunking ☆360 Updated 2 months ago
- ☆196 Updated last month
- AI Dataset Generator – Create realistic datasets for demos, learning, and dashboards ☆371 Updated this week
- ☆279 Updated 2 weeks ago
- Attempt to create an Open Source Privacy Focused Rewind.ai Alternative for data capture ☆214 Updated 5 months ago
- Minimal AI agent framework that just works with only seven tools. ☆534 Updated last month
- ☆603 Updated last week
- ai for jq ☆243 Updated 9 months ago
- This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient … ☆221 Updated 6 months ago
- ☆163 Updated 3 months ago
- A lightweight Model Context Protocol (MCP) server that enables AI assistants like Claude to retrieve and interpret real-time weather data… ☆222 Updated 3 weeks ago
- Docker-based inference engine for AMD GPUs ☆231 Updated 8 months ago
- Applying the ideas of Deepseek R1 to computer use ☆214 Updated 4 months ago
- CleverBee - The Open Source Deep Researcher Tool ☆300 Updated 2 weeks ago
- Browsers-as-a-service for automations and web agents ☆331 Updated this week
- RAG Logger is an open-source logging tool designed specifically for Retrieval-Augmented Generation (RAG) applications. It serves as a lig… ☆220 Updated 6 months ago
- GUI for selecting text files for concatenation and submission to LLMs ☆173 Updated 2 weeks ago
- Retrieval Augmented Generation based on SQLite ☆242 Updated this week
- Examples and guides for using the VLM Run API ☆279 Updated 3 weeks ago
- HTTP API for Claude Code, Goose, Aider, and Codex ☆610 Updated this week
- This repo contains a new way to use bloom filters to do lossless video compression ☆242 Updated 3 weeks ago
- ☆131 Updated last month
- Hierarchical Navigable Small Worlds ☆97 Updated 2 months ago
- EnrichMCP is a python framework for building data driven MCP servers ☆518 Updated last week
- Your toolkit for autonomous, evolving agent ecosystems. Create, execute, govern, and evolve agents that learn from experience, collaborat… ☆437 Updated 2 weeks ago
- A hub for various industry-specific schemas to be used with VLMs. ☆518 Updated last month
- autonomous software apprentice ☆473 Updated this week
- Build agents that scale with a zero-cost abstraction. ☆434 Updated last week
- Streaming Markdown parser for tui clis ☆270 Updated 3 weeks ago