dipampaul17 / KVSplit

Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss. Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.
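A figure close to 59% can be reproduced with simple arithmetic, under the assumption that keys and values are stored in llama.cpp-style block formats where each block of 32 elements carries one 16-bit scale. That assumption is not stated in the description above, and the model shape used below is hypothetical, chosen only to give concrete byte counts. A minimal sketch:

```python
# Assumption: block quantization with 32 elements per block and one
# 16-bit scale per block (the layout used by llama.cpp's q8_0 / q4_0).
BLOCK_SIZE = 32
SCALE_BITS = 16


def bits_per_element(payload_bits: int) -> float:
    """Effective bits per element, including the per-block scale overhead."""
    return payload_bits + SCALE_BITS / BLOCK_SIZE


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, key_bits, value_bits):
    """Total KV cache size in bytes for the given per-element bit widths."""
    elements = n_layers * n_kv_heads * head_dim * n_tokens  # per K (and per V)
    return elements * (key_bits + value_bits) / 8


key_bits = bits_per_element(8)    # 8-bit keys
value_bits = bits_per_element(4)  # 4-bit values

# Relative saving against an FP16 cache (16 bits for keys, 16 for values).
saving = 1 - (key_bits + value_bits) / (16 + 16)
print(f"memory reduction vs FP16: {saving:.1%}")

# Hypothetical model shape, for illustration only.
shape = dict(n_layers=32, n_kv_heads=8, head_dim=128, n_tokens=8192)
fp16 = kv_cache_bytes(**shape, key_bits=16, value_bits=16)
k8v4 = kv_cache_bytes(**shape, key_bits=key_bits, value_bits=value_bits)
print(f"FP16: {fp16 / 2**20:.0f} MiB, K8V4: {k8v4 / 2**20:.0f} MiB")
```

Without the per-block scale, 8 + 4 bits against 32 would give a 62.5% reduction; the scale overhead is what brings the estimate down toward the quoted number.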

☆353

Alternatives and similar repositories for KVSplit

Users who are interested in KVSplit are comparing it to the repositories listed below.

stanford-mast / blast
Browser-LLM Auto-Scaling Technology
☆526 Updated this week
mirth / chonky
Fully neural approach for text chunking
☆360 Updated 2 months ago
Foreseerr / TScale
☆196 Updated last month
metabase / dataset-generator
AI Dataset Generator – Create realistic datasets for demos, learning, and dashboards
☆371 Updated this week
Z-Gort / Reservoirs-Lab
☆279 Updated 2 weeks ago
janwilmake / efficient-recorder
Attempt to create an Open Source Privacy Focused Rewind.ai Alternative for data capture
☆214 Updated 5 months ago
aperoc / toolkami
Minimal AI agent framework that just works with only seven tools.
☆534 Updated last month
possibilities / claude-composer
☆603 Updated last week
taylorai / aiq
ai for jq
☆243 Updated 9 months ago
arc53 / llm-price-compass
This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient …
☆221 Updated 6 months ago
adenta / fire_red_agent
☆163 Updated 3 months ago
TuanKiri / weather-mcp-server
A lightweight Model Context Protocol (MCP) server that enables AI assistants like Claude to retrieve and interpret real-time weather data…
☆222 Updated 3 weeks ago
slashml / amd_inference
Docker-based inference engine for AMD GPUs
☆231 Updated 8 months ago
agentsea / r1-computer-use
Applying the ideas of Deepseek R1 to computer use
☆214 Updated 4 months ago
SureScaleAI / cleverbee
CleverBee - The Open Source Deep Researcher Tool
☆300 Updated 2 weeks ago
onkernel / kernel-images
Browsers-as-a-service for automations and web agents
☆331 Updated this week
Brandon-c-tech / RAG-logger
RAG Logger is an open-source logging tool designed specifically for Retrieval-Augmented Generation (RAG) applications. It serves as a lig…
☆220 Updated 6 months ago
banagale / FileKitty
GUI for selecting text files for concatenation and submission to LLMs
☆173 Updated 2 weeks ago
ggozad / haiku.rag
Retrieval Augmented Generation based on SQLite
☆242 Updated this week
vlm-run / vlmrun-cookbook
Examples and guides for using the VLM Run API
☆279 Updated 3 weeks ago
coder / agentapi
HTTP API for Claude Code, Goose, Aider, and Codex
☆610 Updated this week
ross39 / new_bloom_filter_repo
This repo contains a new way to use bloom filters to do lossless video compression
☆242 Updated 3 weeks ago
libriscv / drogon-sandbox
☆131 Updated last month
dicroce / hnsw
Hierarchical Navigable Small Worlds
☆97 Updated 2 months ago
featureform / enrichmcp
EnrichMCP is a python framework for building data driven MCP servers
☆518 Updated last week
matiasmolinas / evolving-agents
Your toolkit for autonomous, evolving agent ecosystems. Create, execute, govern, and evolve agents that learn from experience, collaborat…
☆437 Updated 2 weeks ago
vlm-run / vlmrun-hub
A hub for various industry-specific schemas to be used with VLMs.
☆518 Updated last month
boldsoftware / sketch
autonomous software apprentice
☆473 Updated this week
hatchet-dev / pickaxe
Build agents that scale with a zero-cost abstraction.
☆434 Updated last week
day50-dev / Streamdown
Streaming Markdown parser for tui clis
☆270 Updated 3 weeks ago