Optimize vLLM with persistent system prompt caching and block reuse for faster, memory-efficient inference.
☆53Oct 6, 2025Updated 7 months ago
Alternatives and similar repositories for VLLM_PromptCache
Users that are interested in VLLM_PromptCache are comparing it to the libraries listed below.
- no☆26Apr 23, 2025Updated last year
- Go bindings for the CUDA Driver and Runtime APIs, cuBLAS, and cuDNN.☆153Dec 24, 2025Updated 4 months ago
- The implementation of RAG-LER☆17Sep 19, 2025Updated 8 months ago
- A DBMS implemented in Python☆15May 16, 2023Updated 3 years ago
- Source code for SIGGRAPH25 DreamMask: Boosting Open-vocabulary Panoptic Segmentation with Synthetic Data☆62Nov 19, 2025Updated 6 months ago
- ☆17Sep 20, 2021Updated 4 years ago
- Efficient controlnet for DiTs☆386May 10, 2025Updated last year
- GraphiContact is a robust method for 3D human reconstruction and contact point prediction from monocular RGB images, utilizing pose-aware…☆51Mar 24, 2026Updated last month
- ☆26Dec 2, 2025Updated 5 months ago
- [AAAI2023] AdapSafe: Adaptive and Safe-Certified Deep Reinforcement Learning-Based Frequency Control for Carbon-neutral Power Systems☆28Feb 19, 2025Updated last year
- kubernetes pod bandwidth rate limiting, setting bandwidth quota & custom-limitrange☆28May 13, 2026Updated last week
- ☆18May 14, 2025Updated last year
- A tool to export, analyse and visualize the transactions, rewards and commissions of your liquidity mining pools or DEX transactions.☆12Feb 13, 2022Updated 4 years ago
- A project completed by repairing the ClaudeCode-CLI source code☆118May 10, 2026Updated last week
- A wearable, necklace-like medical device that uses Alpha Wave sound therapy to treat depression in dogs☆15Mar 22, 2024Updated 2 years ago
- AI-powered tools designed to enhance, restore, and personalize your visual content.☆18Mar 7, 2024Updated 2 years ago
- ☆31Jan 27, 2026Updated 3 months ago
- We will send our supply to the Education Foundation after the migration.☆101May 16, 2025Updated last year
- The repository for 'Tri$^{2}$-plane: Volumetric Avatar Reconstruction with Feature Pyramid'☆141May 4, 2025Updated last year
- use sklearn to detect two types of network attacks☆34Jun 6, 2019Updated 6 years ago
- Repository of "Modal-NexT: toward unified heterogeneous cellular data integration"☆85Jun 16, 2025Updated 11 months ago
- React Render for Phoenix Framework☆52Mar 6, 2026Updated 2 months ago
- Automatically generates numbering for Markdown headings☆27Sep 16, 2023Updated 2 years ago
- ☆22Jan 27, 2022Updated 4 years ago
- ☆68May 16, 2023Updated 3 years ago
- Metrics for Go — lightweight, concurrent-safe, and with built-in support for exporting Counters, Gauges, and Timers to DataDog via DogSta…☆41Jun 8, 2025Updated 11 months ago
- Danmaku (bullet-comment) system☆28Dec 4, 2022Updated 3 years ago
- High-performance Go BLAS/LAPACK with Intel MKL/OpenBLAS acceleration support☆46Dec 6, 2025Updated 5 months ago
- Backend for HR Admin Console with Spring Boot☆12Jan 26, 2024Updated 2 years ago
- A fast JSON5 encoder/decoder for Python☆43Apr 16, 2025Updated last year
- ☆370Apr 1, 2026Updated last month
- the pedometer with excitation system☆30Oct 29, 2021Updated 4 years ago
- HTU21D full-featured driver library for general-purpose MCU and Linux.☆47Oct 25, 2025Updated 6 months ago
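- EAViz (offline version; for the online version see EAViz-OL) is an AI-powered, clinical-grade epilepsy analysis tool focused on "algorithm usability + clinical practicality". It integrates multimodal EEG/video data with deep learning models into an end-to-end "data input, model inference, visual output" pipeline, bridging the gap between research algorithms and clinical users, and providing epilepsy diagnosis, treatment decisions, and research work with…☆26Mar 24, 2026Updated last month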
- ☆509Mar 5, 2026Updated 2 months ago
- lightweight, customizable CSS3 animations, ideal for enhancing web pages and applications.☆13Feb 13, 2024Updated 2 years ago
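- A long-lived connection service written in Go that wraps connections in goroutines; supports ACK message receipts, heartbeat detection, and distributed deployment; exposes RPC and HTTP call modes; and provides online user counts, point-to-point messaging, broadcast messaging, and other modes☆25Mar 28, 2023Updated 3 years ago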
- [ACL 2025 Findings] The official GitHub repo for the paper "Nuclear Deployed: Analyzing Catastrophic Risks in Decision-making of Autonomo…☆21May 20, 2025Updated last year
- Hand-built cloud computing DevOps project. Phase 1: private cloud dashboard; Phase 2: CI/CD☆35Dec 19, 2024Updated last year