finnchen11 / VLLM_PromptCache
Optimize vLLM with persistent system prompt caching and block reuse for faster, memory-efficient inference.
☆52Updated last week
Alternatives and similar repositories for VLLM_PromptCache
Users that are interested in VLLM_PromptCache are comparing it to the libraries listed below
- slark is a cross platform player that supports iOS and Android☆91Updated 3 weeks ago
- The code for paper "Learning from Committee: Reasoning Distillation from a Mixture of Teachers with Peer-Review" accepted by ACL 2025.☆46Updated 3 months ago
- ☆42Updated 4 months ago
- ☆139Updated last year
- Source code of Fuyao, built on Nightcore☆17Updated last year
- Main Project of AIDE☆91Updated 6 months ago
- Advanced Driving Assistance System based on Jetson Nano☆83Updated last month
- Example project using universal links as deeplinks to switch iOS apps.☆13Updated last year
- ☆41Updated last year
- ☆135Updated last year
- Desktop Tiny Agent is a lightweight, modular desktop intelligent agent framework. It offers plugin extensibility, task scheduling (sync/a…☆79Updated last week
- ☆143Updated last year
- CAPTCHA recognition☆12Updated 3 years ago
- ☆22Updated 4 months ago
- An MCP client developed to let you easily configure various large models and connect to various MCP Servers.☆68Updated 2 weeks ago
- A bibliometric visualization platform that integrates Gestalt design principles, keyword extraction algorithms, temporal algorithms, mach…☆89Updated 2 months ago
- ☆12Updated 2 years ago
- ☆42Updated 7 months ago
- CommercialGoatAPI is a commercial project that provides remote HTTP access to Goat API(and alias API) supporting all interfaces of these …☆82Updated last week
- Some libraries (docs) for the RISCV64 architecture, made easy for users to install and deploy☆69Updated this week
- ☆43Updated 3 weeks ago
- Convert Excel to Go structs and JSON (read Excel in Go)☆40Updated 6 months ago
- A QR-based ordering system for a seamless dining experience. Deploy on docker.☆49Updated 3 weeks ago
- An implementation of the Streaming Wavelet Module, which efficiently applies the wavelet transform sequentially to a sequence of signals.☆120Updated last month
- A lightweight and easy-to-use RPC framework created by Bruce Pang☆125Updated 6 months ago
- Kubernetes Operator for managing OpenResty with custom CRDs (OpenResty, Server, Location, Upstream, RateLimitPolicy)☆52Updated 3 months ago
- Code and dataset of ARMOUR: zero-permission sensor usage (ACM WiSec 2025)☆37Updated 2 months ago
- a rather fast time struct getter☆80Updated last month
- A toolkit that helps you automatically delete old Docker images from an AWS ECR repository, keeping only the latest N images.☆52Updated 6 months ago
- Ethereum World Cup prediction project☆14Updated last year