finnchen11 / VLLM_PromptCache
Optimize vLLM with persistent system prompt caching and block reuse for faster, memory-efficient inference.
☆53Updated 2 months ago
Alternatives and similar repositories for VLLM_PromptCache
Users that are interested in VLLM_PromptCache are comparing it to the libraries listed below
- Dataset and evaluation code of ISDrama (ACM-MM 2025): Immersive Spatial Drama Generation through Multimodal Prompting☆236Updated 4 months ago
- Kubernetes Operator for managing OpenResty with custom CRDs (OpenResty, Server, Location, Upstream, RateLimitPolicy)☆50Updated 7 months ago
- ☆80Updated this week
- Advanced Driving Assistance System based on Jetson Nano☆84Updated 5 months ago
- A pytorch implementation of the paper "TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Simi…☆343Updated 2 weeks ago
- Desktop Tiny Agent is a lightweight, modular desktop intelligent agent framework. It offers plugin extensibility, task scheduling (sync/a…☆80Updated 4 months ago
- ☆42Updated last month
- Ethereum World Cup betting project☆14Updated 2 years ago
- ☆135Updated last year
- ☆137Updated last year
- ☆41Updated last year
- a rather fast time struct getter☆80Updated 5 months ago
- https://www.kaggle.com/competitions/sorghum-id-fgvc-9☆19Updated 2 years ago
- Cascade is a production-ready, high-performance, and low-latency audio stream processing library designed for Voice Activity Detection (V…☆83Updated last week
- Main Project of AIDE☆91Updated 10 months ago
- Build a complete experiment pipeline for your PyTorch MIP model in 10 seconds.☆86Updated this week
- ☆32Updated last month
- Example project using universal links as deeplinks to switch iOS apps.☆13Updated last year
- ☆200Updated 2 months ago
- Pure RL to post-train base models for social reasoning capabilities. Lightweight replication of DeepSeek-R1-Zero with Social IQa dataset.☆38Updated 9 months ago
- ☆22Updated 3 months ago
- ☆324Updated this week
- ☆143Updated last year
- This is a useful development tool that supports mocking for both GraphQL and RESTful APIs.☆22Updated last year
- ☆11Updated 3 years ago
- ☆12Updated 2 years ago
- A bibliometric visualization platform that integrates Gestalt design principles, keyword extraction algorithms, temporal algorithms, mach…☆89Updated 2 months ago
- Code for "FaithLens: Detecting and Explaining Faithfulness Hallucination"☆89Updated last week
- ☆13Updated 3 years ago
- ☆42Updated 11 months ago