TurboQuant: Near-optimal KV cache quantization for LLM inference (3-bit keys, 2-bit values) with Triton kernels + vLLM integration
☆328 · Mar 27, 2026 · Updated this week
Alternatives and similar repositories for turboquant
Users that are interested in turboquant are comparing it to the libraries listed below.
- A CLI for managing AI skill packages ☆27 · Jan 18, 2026 · Updated 2 months ago
- Standalone repo for our Atropos integration with Thinking Machines Tinker API (https://thinkingmachines.ai/tinker/) ☆20 · Mar 22, 2026 · Updated last week
- Talk to your shell in natural language. Locally. ☆55 · Feb 15, 2026 · Updated last month
- Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge. ☆18 · Feb 9, 2026 · Updated last month
- Persistent dense gemm for Hopper in `CuTeDSL` ☆15 · Aug 9, 2025 · Updated 7 months ago
- Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚ ☆23 · Jul 14, 2025 · Updated 8 months ago
- LEMMA: Logical Engine for Multi-domain Mathematical Analysis ☆28 · Feb 14, 2026 · Updated last month
- REAP expert pruning for MoE LLMs on Apple Silicon via MLX ☆49 · Mar 16, 2026 · Updated last week
- ☆13 · Jan 14, 2026 · Updated 2 months ago
- ☆23 · Jul 11, 2025 · Updated 8 months ago
- A complete end-to-end system that takes mathematical problems and automatically generates polished educational videos ☆33 · Jan 3, 2026 · Updated 2 months ago
- Official implementation of Categorical Flow Maps on text. ☆49 · Feb 16, 2026 · Updated last month
- An open-source robotics knowledge base and project library for all skill levels. Includes structured lessons, code examples, and system-l… ☆18 · Mar 15, 2026 · Updated 2 weeks ago
- A lightweight graphics library for the Elm programming language ☆15 · Jul 15, 2017 · Updated 8 years ago
- ☆41 · Feb 14, 2026 · Updated last month
- ☆16 · Feb 24, 2026 · Updated last month
- [ICML 2025] Improving Planning of Agents for Long-Horizon Tasks ☆27 · Oct 2, 2025 · Updated 5 months ago
- High Performance FP8 GEMM Kernels for SM89 and later GPUs. ☆20 · Jan 24, 2025 · Updated last year
- monoDrive autonomous vehicle client ☆11 · Aug 14, 2023 · Updated 2 years ago
- Personal solutions to the Triton Puzzles ☆20 · Jul 18, 2024 · Updated last year
- An online Verilog IDE based on YosysJS. ☆24 · Jan 7, 2016 · Updated 10 years ago
- ☆18 · Mar 30, 2023 · Updated 2 years ago
- This repository contains instructions for fast-forward Raspberry Pi setup with Raspbian without a monitor and keyboard - just a network c… ☆16 · Apr 9, 2014 · Updated 11 years ago
- Code for my practice. ☆10 · Jul 18, 2018 · Updated 7 years ago
- JSON saver/loader for TD ☆10 · Aug 19, 2019 · Updated 6 years ago
- ☆22 · May 5, 2025 · Updated 10 months ago
- ☆27 · Mar 10, 2026 · Updated 2 weeks ago
- A GUI for Claude Code ☆56 · Mar 2, 2026 · Updated 3 weeks ago
- ☆20 · Apr 10, 2025 · Updated 11 months ago
- Powdered Metal: High performance LLM fine-tuning framework for Apple Silicon ☆187 · Updated this week
- Check Safety of SSH Public Keys ☆12 · Oct 8, 2022 · Updated 3 years ago
- Simple wrapper around docker-machine tool for creating and managing cloud instances across many providers ☆10 · Sep 5, 2017 · Updated 8 years ago
- Alternative caching backends for `{memoise}` & `{shiny}`. ☆13 · Mar 27, 2023 · Updated 3 years ago
- General Matrix Multiplication using NVIDIA Tensor Cores ☆28 · Jan 25, 2025 · Updated last year
- Tool that allows users to create their own smart contracts in Polkadot. ☆14 · Jun 3, 2024 · Updated last year
- Houdini Python Wiki ☆18 · Mar 18, 2024 · Updated 2 years ago
- Collective and Neighbor Collective Optimizations and Extensions ☆13 · Mar 2, 2026 · Updated 3 weeks ago
- Mapping workshop for TouchDesigner Summit 2019 ☆12 · Aug 19, 2019 · Updated 6 years ago
- Handy list of network visualisation libraries for R ☆12 · Nov 11, 2019 · Updated 6 years ago