finnchen11 / VLLM_PromptCache
Optimize vLLM with persistent system prompt caching and block reuse for faster, memory-efficient inference.
☆53Updated 3 months ago
Alternatives and similar repositories for VLLM_PromptCache
Users who are interested in VLLM_PromptCache are comparing it to the repositories listed below.
- ☆32Updated 2 months ago
- ☆200Updated 3 months ago
- ☆42Updated last month
- Build a complete experiment pipeline for your PyTorch MIP model in 10 seconds.☆86Updated last week
- Advanced Driving Assistance System based on Jetson Nano☆86Updated 6 months ago
- A pytorch implementation of the paper "TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Simi…☆343Updated last month
- ☆80Updated 2 weeks ago
- Main Project of AIDE☆91Updated 11 months ago
- Manage TypeCho the Hexo way (uses Github Actions to automatically publish posts to TypeCho)☆83Updated 9 months ago
- ☆42Updated last year
- ☆135Updated last year
- Dataset and evaluation code of ISDrama(ACM-MM 2025): Immersive Spatial Drama Generation through Multimodal Prompting☆236Updated 5 months ago
- mini-webui delivers a streamlined AI chat console for teams that need rapid iteration, reliable integrations, and production-ready guardr…☆44Updated last month
- a rather fast time struct getter☆80Updated 6 months ago
- ☆43Updated 5 months ago
- A bibliometric visualization platform that integrates Gestalt design principles, keyword extraction algorithms, temporal algorithms, mach…☆89Updated 2 months ago
- Desktop Tiny Agent is a lightweight, modular desktop intelligent agent framework. It offers plugin extensibility, task scheduling (sync/a…☆80Updated 4 months ago
- ☆135Updated last year
- ☆143Updated last year
- [PRL 2025, APSIPA 2022] Syllable Analysis Data Augmentation (SADA), This project introduces a glyph dictionary and grammar-aware augmenta…☆68Updated 4 months ago
- AI phone agents for business.☆18Updated 11 months ago
- A lightweight and easy-to-use RPC framework created by Bruce Pang☆124Updated 11 months ago
- 4th Place Solution for the Kaggle Competition: LMSYS - Chatbot Arena Human Preference Predictions☆171Updated last year
- An Integrated Library for Tuning, Deploying and Interpreting Genomic Models☆120Updated 4 months ago
- Code and dataset of ARMOUR: zero-permission sensor usage (ACM WiSec 2025)☆38Updated 7 months ago
- A goroutine reuse pool built on Go's goroutines and Go concurrent programming☆71Updated 10 months ago
- An Interactive Fiction Demo Powered by AI Dungeon☆84Updated 3 months ago
- Learning chatbot that can automatically fetch lecture transcript☆36Updated 5 months ago
- A cloud-native data pipeline and visualization project analyzing Formula 1 racing data using Azure, Databricks, Delta Lake, Tableau, and …☆90Updated 5 months ago
- Repository for the paper:☆69Updated last year