Alibaba Cloud's high-performance KVCache system for LLM inference, with components for global cache management, inference simulation (HiSim), and more.
☆159Apr 30, 2026Updated last week
Alternatives and similar repositories for tair-kvcache
Users who are interested in tair-kvcache are comparing it to the libraries listed below.
- a simple API to use CUPTI☆10Aug 19, 2025Updated 8 months ago
- Agent skills for vLLM☆67Apr 3, 2026Updated last month
- ☆67Feb 5, 2026Updated 3 months ago
- DeepXTrace is a lightweight tool for precisely diagnosing slow ranks in DeepEP-based environments.☆97Jan 16, 2026Updated 3 months ago
- ☆66Apr 26, 2025Updated last year
- An experimental communicating attention kernel based on DeepEP.☆34Jul 29, 2025Updated 9 months ago
- An event-driven C library, originally open-sourced by Taobao and now maintained here☆21Mar 15, 2020Updated 6 years ago
- High-performance RDMA-based distributed feature collection component for training GNN models on EXTREMELY large graphs☆55Jul 3, 2022Updated 3 years ago
- A parser for PTX 6.5☆13Jun 19, 2023Updated 2 years ago
- Accelerated Computer Vision Lab (ACCV-Lab) is a systematic collection of packages with the common goal to facilitate end-to-end efficient…☆58Updated this week
- ☆26Oct 2, 2023Updated 2 years ago
- Large language models designed for formal theorem proving through tool-integrated reasoning.☆34Aug 13, 2025Updated 8 months ago
- NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer☆181Feb 11, 2026Updated 2 months ago
- Local inference for Llama and Tongyi Qianwen (Qwen) large models, implemented with Apple Metal☆10Apr 26, 2024Updated 2 years ago
- A prefill & decode disaggregated LLM serving framework with shared GPU memory and fine-grained compute isolation.☆125Dec 25, 2025Updated 4 months ago
- ☆362Jan 28, 2026Updated 3 months ago
- FlashSampling: Fast and Memory-Efficient Exact Sampling (https://huggingface.co/papers/2603.15854)☆70Updated this week
- ☆52May 19, 2025Updated 11 months ago
- ☆16Nov 14, 2023Updated 2 years ago
- ☆13Jun 29, 2024Updated last year
- HPC Game Platform☆11Apr 20, 2023Updated 3 years ago
- GNU Gzip with Kunpeng optimization.☆12Mar 30, 2022Updated 4 years ago
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.☆109Jun 28, 2025Updated 10 months ago
- Collection of CMake toolchain files and scripts for cross-platform build and CI testing (GCC, Visual Studio, iOS, Android, Clang analyz…☆14May 24, 2024Updated last year
- Source code analysis of NSQ v1.1.0☆14Aug 9, 2020Updated 5 years ago
- clickhouse-copier (obsolete)☆15Mar 17, 2024Updated 2 years ago
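- I, Chen Ping'an, with but a single key press, can move mountains, overturn seas, subdue demons, suppress devils, command gods, pluck stars, sever rivers, topple cities, and split the heavens!☆22Jun 4, 2022Updated 3 years ago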
- Implementation of Hyena Hierarchy in JAX☆10Apr 30, 2023Updated 3 years ago
- Depict GPU memory footprint during DNN training of PyTorch☆11Nov 17, 2022Updated 3 years ago
- IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse☆96Mar 14, 2026Updated last month
- A flexible C++ formatting library designed for i18n, using embedded script to output plural forms, grammatical gender, etc. correctly☆11May 3, 2026Updated last week
- a high-performance, large-capacity, multi-tenant, data-persistent, strong data consistency based on raft, Redis-compatible elastic KV dat…☆55May 1, 2026Updated last week
- ☆44Oct 15, 2025Updated 6 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆43Apr 27, 2026Updated last week
- ☆14Oct 8, 2023Updated 2 years ago
- Benchmark tests supporting the TiledCUDA library.☆18Nov 19, 2024Updated last year
- A Linux network library in modern C++, based on EventLoop and multithreading☆11Apr 5, 2020Updated 6 years ago
- Development area for another repo: Learn_Bluespec_and_RISCV_Design☆13Nov 10, 2025Updated 5 months ago
- A tool to detect which version of Redis your Redis-Like database is compatible with.☆42Updated this week