aibrix/PrisKV

High Performance KV Cache Store for LLM

☆59

Alternatives and similar repositories for PrisKV

Users that are interested in PrisKV are comparing it to the libraries listed below.

xiaguan / pegaflow
PegaFlow is a high-performance KV cache offloading solution for vLLM v1 on single-node multi-GPU setups.
☆25 · Jan 7, 2026 · Updated 6 months ago
bytedance / InfiniStore
KV cache store for distributed LLM inference
☆425 · Nov 13, 2025 · Updated 8 months ago
taco-project / FlexKV
☆304 · Updated this week
tair-opensource / resp-benchmark
resp-benchmark is a benchmark tool for testing databases that support the RESP protocol, such as Redis, Valkey, and Tair.
☆27 · Jun 17, 2026 · Updated last month
novitalabs / pegaflow
High-performance KV cache storage for LLM inference: GPU offloading, SSD caching, and cross-node sharing via RDMA. Works with vLLM and S…
☆179 · Updated this week
bojieli / SocksDirect
SocksDirect code repository
☆20 · May 6, 2026 · Updated 2 months ago
DeepLink-org / DLSlime
Composable and Embeddable Communication Runtime for Distributed AI Services
☆102 · Jun 5, 2026 · Updated last month
jagmarques / nexusquant
Training-free KV cache compression via E8 lattice VQ. 2-bit KV that preserves retrieval (30/30 NIAH vs TurboQuant 0/30). Calibration-free…
☆26 · Updated this week
alibaba / tair-kvcache
Alibaba Cloud's high-performance KVCache system for LLM inference, with components for global cache management, inference simulation(HiSi…
☆215 · Updated this week
leiysky / tigraph
☆10 · Jan 18, 2021 · Updated 5 years ago
KuangjuX / AttnLink
An experimental communicating attention kernel based on DeepEP.
☆34 · Jul 29, 2025 · Updated 11 months ago
romitjain / kachua-mlsys
[MLSys 26] 🥇 Solution for Gated Delta Net Track of MLSys 26 Flash infer competition
☆35 · May 22, 2026 · Updated last month
microsoft / amlFilesystem-lustre
Lustre Repository with MS patches
☆16 · Updated this week
Terra-Flux / PolyRL
[NSDI'26] PolyRL is a reinforcement learning framework for LLMs that harvests spot instances on the cloud to reduce cost.
☆19 · Mar 30, 2026 · Updated 3 months ago
sgl-project / DeepGEMM
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
☆32 · Updated this week
kvcache-ai / TrEnv-X
☆95 · Sep 15, 2025 · Updated 10 months ago
ProjectMitosisOS / mitosis-core
An OS kernel module for fast **remote** fork using advanced datacenter networking (RDMA).
☆72 · Feb 15, 2025 · Updated last year
0xWelt / VibeRL
VibeRL is a Reinforcement Learning framework built essentially through vibe coding with Kimi K2.
☆17 · Jul 13, 2026 · Updated last week
chenyu-jiang / dcp
Code repository for the SOSP'25 paper DCP: Addressing Input Dynamism In Long-Context Training via Dynamic Context Parallelism.
☆21 · Nov 28, 2025 · Updated 7 months ago
zartbot / gfd
GPU Functional Descriptor for memory access
☆34 · May 24, 2026 · Updated last month
KuangjuX / NVSHMEM-Tutorial
NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer
☆195 · Feb 11, 2026 · Updated 5 months ago
Cray / lustre
Cray Lustre is HPE's curated Lustre distro for HPE ClusterStor, Cray EX, and other HPE/Cray clients
☆18 · Updated this week
ai-dynamo / nixl
NVIDIA Inference Xfer Library (NIXL)
☆1,139 · Updated this week
falcon-infra / falconfs
FalconFS is a high-performance distributed file system (DFS) designed for AI workloads.
☆65 · May 19, 2026 · Updated 2 months ago
inclusionAI / AState
☆41 · Dec 9, 2025 · Updated 7 months ago
dmemsys / Aceso
This is the implementation repository of our SOSP'24 paper: Aceso: Achieving Efficient Fault Tolerance in Memory-Disaggregated Key-Value …
☆24 · Oct 20, 2024 · Updated last year
xlite-dev / qwen-image-fast
⚡️Qwen-Image 4.8x🎉 speedup with Hybrid Acceleration for low VRAM GPUs
☆17 · Oct 24, 2025 · Updated 8 months ago
wonglkd / BCacheSim
Cache Simulator specialized for flash caching for bulk storage systems
☆13 · Jan 16, 2024 · Updated 2 years ago
cosmoss-jigu / tips
☆15 · Mar 31, 2022 · Updated 4 years ago
uccl-project / mKernel
mKernel: fast multi-node, multi-GPU fused kernels
☆251 · Jun 21, 2026 · Updated 3 weeks ago
uccl-project / rdmatop
htop-like TUI for real-time RDMA network monitoring.
☆76 · Jul 12, 2026 · Updated last week
wu-kan / wuk_cupti_wrapper
a simple API to use CUPTI
☆10 · Aug 19, 2025 · Updated 11 months ago
infinigence / HamiltonAttention
☆45 · Oct 15, 2025 · Updated 9 months ago
Oneflow-Inc / dfccl
☆26 · Feb 17, 2025 · Updated last year
Thesys-lab / cacheWorkloadAnalysisOSDI20
☆16 · Aug 11, 2021 · Updated 4 years ago
kvcache-ai / Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
☆5,925 · Updated this week
THUDM / IndexCache
IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse
☆127 · Mar 14, 2026 · Updated 4 months ago
madsys-dev / smart
Scaling Up Memory Disaggregated Applications with SMART
☆35 · Apr 23, 2024 · Updated 2 years ago
HydraQYH / hp_rms_norm
High-performance RMSNorm implementation using SM core storage (registers and shared memory)
☆30 · Jan 22, 2026 · Updated 5 months ago