LMCache on Ascend
☆70May 11, 2026Updated last week
Alternatives and similar repositories for LMCache-Ascend
Users that are interested in LMCache-Ascend are comparing it to the libraries listed below.
- Systematic and comprehensive benchmarks for LLM systems.☆57Jan 28, 2026Updated 3 months ago
- The Intelligent Inference Scheduler for Large-scale Inference Services.☆68Feb 12, 2026Updated 3 months ago
- Community maintained hardware plugin for vLLM on Ascend☆2,077Updated this week
- ArcticInference: vLLM plugin for high-throughput, low-latency inference☆431Apr 23, 2026Updated 3 weeks ago
- A simple tool for parsing the profile.json file of mxnet☆14Aug 1, 2018Updated 7 years ago
- SGLang kernel library for NPU☆130Updated this week
- An Example of MXNet Models Compilation and Deployment with NNVM in C++☆16Apr 25, 2018Updated 8 years ago
- ParaDnn: A systematic performance analysis methodology for deep learning.☆40Mar 30, 2020Updated 6 years ago
- ☆101Feb 11, 2026Updated 3 months ago
- A lightweight large-model inference framework☆23May 26, 2025Updated 11 months ago
- Expert Kit is an efficient foundation of Expert Parallelism (EP) for MoE model Inference on heterogeneous hardware☆63May 12, 2026Updated last week
- Cloud Native Benchmarking of Foundation Models☆45Jul 31, 2025Updated 9 months ago
- 2023 XFlops Training☆13Jan 23, 2024Updated 2 years ago
- Large language model deployment on the Ascend 310 chip☆26Jun 14, 2024Updated last year
- ☆149Mar 5, 2026Updated 2 months ago
- [NSDI25] AutoCCL: Automated Collective Communication Tuning for Accelerating Distributed and Parallel DNN Training☆31May 2, 2025Updated last year
- ☆11Feb 5, 2017Updated 9 years ago
- ☆13Jan 16, 2019Updated 7 years ago
- RTL blocks compatible with the Rocket Chip Generator☆17Mar 30, 2025Updated last year
- ☆25Jan 7, 2023Updated 3 years ago
- ☆10Dec 20, 2024Updated last year
- A minimal implementation of vLLM.☆72Jul 27, 2024Updated last year
- ☆119May 19, 2025Updated last year
- Word template for a Lancaster University thesis☆11Mar 19, 2022Updated 4 years ago
- [DAC'25] Official implementation of "HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference"☆116Dec 15, 2025Updated 5 months ago
- Flash Attention in raw CUDA C beating PyTorch☆38May 14, 2024Updated 2 years ago
- ☆13Nov 21, 2024Updated last year
- NVIDIA Inference Xfer Library (NIXL)☆1,030Updated this week
- My little home: a complete record of the renovation☆11Apr 19, 2021Updated 5 years ago
- graph challenge 2021☆27Jul 9, 2021Updated 4 years ago
- PCB libraries and templates for rocket-chip based FPGA/ASIC designs☆17May 9, 2026Updated last week
- An Alluring, Dark, and Muted Theme For Xcode.☆14Aug 6, 2019Updated 6 years ago
- Resources on how to use the HEC at Lancaster University☆14Jan 11, 2022Updated 4 years ago
- ☆10Dec 27, 2020Updated 5 years ago
- A novel-scraping crawler written in every mainstream programming language☆10May 11, 2022Updated 4 years ago
- This repository has moved to "https://github.com/HoneyWhiteCloud/enable-hdr-oneplus13-webui"☆10Aug 30, 2025Updated 8 months ago
- Python module to compute the Mann-Kendall test for trend in time series data☆10Apr 18, 2017Updated 9 years ago
- ☆28Jan 7, 2023Updated 3 years ago
- ☆98Nov 17, 2025Updated 6 months ago