dipampaul17 / KVSplit
Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss. Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.
☆362 Updated 7 months ago
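The 59% figure in the description is consistent with simple bit counting, if one assumes block-quantized storage in which every 32 cached values share one 16-bit scale (the layout used by common 8-bit and 4-bit block formats) and a 16-bit baseline for both keys and values. The sketch below is a back-of-envelope check under those assumptions, not KVSplit's own code; `BLOCK`, `SCALE_BITS`, `bits_per_element`, and `kv_savings` are hypothetical names introduced for illustration.

```python
# Back-of-envelope estimate of KV cache savings for mixed-precision quantization.
# Assumptions (not taken from the KVSplit source): values are stored in blocks
# of 32 with one 16-bit scale per block, and the baseline cache is 16-bit.
BLOCK = 32        # elements per quantization block (assumed)
SCALE_BITS = 16   # bits of scale metadata per block (assumed)


def bits_per_element(payload_bits: int) -> float:
    """Effective bits per cached element, including the per-block scale."""
    return payload_bits + SCALE_BITS / BLOCK


def kv_savings(key_bits: int, value_bits: int, baseline_bits: int = 16) -> float:
    """Fraction of KV cache memory saved relative to an unquantized baseline."""
    quantized = bits_per_element(key_bits) + bits_per_element(value_bits)
    baseline = 2 * baseline_bits  # one key element plus one value element
    return 1.0 - quantized / baseline


if __name__ == "__main__":
    # 8-bit keys, 4-bit values: (8.5 + 4.5) / 32 of the baseline remains.
    print(f"{kv_savings(8, 4):.1%}")  # prints 59.4%
```

Without the per-block scale overhead the same calculation gives 62.5%, so the quoted 59% suggests the reported number accounts for quantization metadata.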
Alternatives and similar repositories for KVSplit
Users who are interested in KVSplit are comparing it to the repositories listed below.
- Add object detection, tracking, mobile notifications, and search to any security camera. ☆505 Updated this week
- High-Performance Implementation of OpenAI's TikToken. ☆465 Updated 5 months ago
- Browser-LLM Auto-Scaling Technology ☆768 Updated last week
- Attempt to create an Open Source Privacy Focused Rewind.ai Alternative that is a POD (Personal Online Datastore) ☆227 Updated 3 months ago
- ☆199 Updated 7 months ago
- CLI app- Give it a YouTube URL and you get a transcription with possible speaker identification and optional summary or translation, all … ☆330 Updated 2 weeks ago
- ☆280 Updated 6 months ago
- state of the art browsing agent (WebArena 72.7%) ☆361 Updated 2 months ago
- Persistent memory for LLMs and apps. Content-addressed storage with dedupe, compression, full-text and vector search. ☆358 Updated this week
- Applying the ideas of Deepseek R1 to computer use ☆220 Updated 10 months ago
- Fully neural approach for text chunking ☆404 Updated 2 months ago
- ☆164 Updated 9 months ago
- CleverBee - The Open Source Deep Researcher Tool ☆309 Updated 6 months ago
- This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient … ☆222 Updated last year
- Examples and guides for using the VLM Run API ☆302 Updated this week
- Git Based Memory Storage for Conversational AI Agent ☆758 Updated last month
- Docker-based inference engine for AMD GPUs ☆231 Updated last year
- Physical AI Assistant that illuminates your life ☆191 Updated 2 months ago
- Multimodal RAG to search and interact locally with technical documents of any kind ☆284 Updated last month
- This is the public release of MIRA OS. Discrete memories decay through momentum loss, tools auto-configure when dropped into tools/ folde… ☆323 Updated this week
- ai for jq ☆249 Updated last year
- An open-source framework for verifiably private AI inference ☆895 Updated 2 weeks ago
- A hub for various industry-specific schemas to be used with VLMs. ☆537 Updated 2 weeks ago
- GUI for selecting text files for concatenation and submission to LLMs ☆181 Updated last month
- Fact Graph ☆375 Updated 2 months ago
- A secure local sandbox to run LLM-generated code using Apple containers ☆690 Updated 3 weeks ago
- Chat UI for Coderunner ☆192 Updated 4 months ago
- Animating R1's thoughts. ☆385 Updated 10 months ago
- Run and explore Llama models locally with minimal dependencies on CPU ☆190 Updated last year
- A comprehensive suite of tools, built to liberate science by making the creation, evaluation, and dissemination of research more transpar… ☆228 Updated 4 months ago