Prefix-Aware Attention for LLM Decoding
☆35Mar 31, 2026Updated last week
Alternatives and similar repositories for PAT
Users that are interested in PAT are comparing it to the libraries listed below.
- An Open-Source RAG Workload Trace to Optimize RAG Serving Systems☆36Nov 18, 2025Updated 4 months ago
- ☆12Dec 1, 2023Updated 2 years ago
- [ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning☆10Apr 28, 2023Updated 2 years ago
- ☆19Feb 13, 2026Updated last month
- A simple demo for using Sentinel with Spring Cloud Alibaba☆16Nov 8, 2018Updated 7 years ago
- An ultra-fast, distributed Safetensors loader☆34Mar 26, 2026Updated 2 weeks ago
- Embedded systems lab reports for Network Engineering at Beijing University of Posts and Telecommunications (BUPT)☆12Jan 7, 2021Updated 5 years ago
- Important experiments on memory management, file access, network transfer, job scheduler, and so on.☆15Apr 27, 2022Updated 3 years ago
- BUPT Neural Networks and Deep Learning course project☆10Dec 29, 2023Updated 2 years ago
- [AFK] Hardware router in Chisel (THU Network Joint Lab 2020)☆14Oct 8, 2020Updated 5 years ago
- alibaba/Sentinel zuul integration sample☆11Oct 20, 2018Updated 7 years ago
- Bopute (波普特) Hotel air-conditioning management system☆14Jun 14, 2020Updated 5 years ago
- BUPT NLP assignment 2: building and evaluating Chinese subword vectors with two methods, SVD decomposition and SGNS☆10May 16, 2023Updated 2 years ago
- ☆99Jan 22, 2026Updated 2 months ago
- ☆21Jun 1, 2025Updated 10 months ago
- Reading seminar in Harvard Cloud Networking and Systems Group☆16Aug 29, 2022Updated 3 years ago
- Source code for the BUPT Deep Learning and Neural Networks course project (2021 cohort)☆15Jan 21, 2024Updated 2 years ago
- BUPT Software Engineering Project☆18Aug 20, 2018Updated 7 years ago
- ☆17May 10, 2024Updated last year
- BATCH: Adaptive Batching for Efficient Machine Learning Serving on Serverless Platforms☆11Aug 7, 2021Updated 4 years ago
- [ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts☆40Feb 29, 2024Updated 2 years ago
- ☆36Dec 9, 2025Updated 4 months ago
- COSCon Workshop on ECharts☆18Oct 18, 2018Updated 7 years ago
- Research prototype of PRISM — a cost-efficient multi-LLM serving system with flexible time- and space-based GPU sharing.☆59Mar 17, 2026Updated 3 weeks ago
- ☆15Aug 15, 2024Updated last year
- ☆42Oct 11, 2025Updated 5 months ago
- ☆16Apr 22, 2025Updated 11 months ago
- Projects for the training camp's training track☆26Jan 28, 2026Updated 2 months ago
- This repo is used to assess NSL's scientific research assistants.☆18Jul 7, 2025Updated 9 months ago
- ☆19Jun 3, 2023Updated 2 years ago
- Deferred Continuous Batching in Resource-Efficient Large Language Model Serving (EuroMLSys 2024)☆19May 28, 2024Updated last year
- Arya: Arbitrary Graph Pattern Mining with Decomposition-based Sampling☆16Sep 27, 2023Updated 2 years ago
- [ICML2024] Adaptive Text Watermark for Large Language Models☆25Dec 11, 2024Updated last year
- FlexFlow Serve: Low-Latency, High-Performance LLM Serving☆77Sep 15, 2025Updated 6 months ago
- ☆33Updated this week
- BytePS examples (Vision, NLP, GAN, etc)☆19Nov 24, 2022Updated 3 years ago
- Spring Cloud Alibaba, Dubbo, Alibaba Cloud, and more.☆33Nov 16, 2018Updated 7 years ago
- This is the implementation repository of our SOSP'24 paper: Aceso: Achieving Efficient Fault Tolerance in Memory-Disaggregated Key-Value …☆24Oct 20, 2024Updated last year
- A Streaming-Native Serving Engine for TTS/STS Models☆62Updated this week