Prefix-Aware Attention for LLM Decoding
☆39May 26, 2026Updated 2 weeks ago
Alternatives and similar repositories for PAT
Users that are interested in PAT are comparing it to the libraries listed below.
- An Open-Source RAG Workload Trace to Optimize RAG Serving Systems☆36Nov 18, 2025Updated 6 months ago
- ☆12Dec 1, 2023Updated 2 years ago
- Notes for the Computer Systems Fundamentals (ICS) course, School of Software, Shanghai Jiao Tong University☆15Feb 7, 2022Updated 4 years ago
- [ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning☆10Apr 28, 2023Updated 3 years ago
- A simple demo for using Sentinel with Spring Cloud Alibaba☆17Nov 8, 2018Updated 7 years ago
- ☆26Oct 1, 2025Updated 8 months ago
- A mini version of k8s that implements the abstraction of pod, service, auto-scaling, replicaSet and provides DNS, GPU and serverless serv…☆17Jun 16, 2023Updated 2 years ago
- Embedded systems lab reports for Network Engineering at Beijing University of Posts and Telecommunications☆12Jan 7, 2021Updated 5 years ago
- Important experiments on memory management, file access, network transfer, job scheduler, and so on.☆15Apr 27, 2022Updated 4 years ago
- BUPT Neural Networks and Deep Learning course project☆10Dec 29, 2023Updated 2 years ago
- [AFK] Hardware router in Chisel (THU Network Joint Lab 2020)☆14Oct 8, 2020Updated 5 years ago
- UBio-MolFM is a foundation model suite for molecular modeling, developed by the UBio-MolFM team.☆31Apr 13, 2026Updated last month
- alibaba/Sentinel zuul integration sample☆11Oct 20, 2018Updated 7 years ago
- An ultra-fast, distributed Safetensors loader☆57May 27, 2026Updated last week
- 波普特 (Bopute) Hotel air conditioning management system☆14Jun 14, 2020Updated 5 years ago
- ☆46Dec 19, 2025Updated 5 months ago
- A LaTeX template that provides a beautiful class schedule design with colorful course blocks.☆14May 13, 2026Updated 3 weeks ago
- ☆121Apr 23, 2026Updated last month
- ☆22Jun 1, 2025Updated last year
- Source code for the BUPT Deep Learning and Neural Networks course project (2021 cohort)☆15Jan 21, 2024Updated 2 years ago
- Reading seminar in Harvard Cloud Networking and Systems Group☆16Aug 29, 2022Updated 3 years ago
- BUPT Software Engineering Project☆18Aug 20, 2018Updated 7 years ago
- ☆17May 10, 2024Updated 2 years ago
- Notes for the course Architecture of Application Systems (SE3353), School of Software, Shanghai Jiao Tong University☆11Feb 2, 2024Updated 2 years ago
- API client for the Shuiyuan community (水源社区)☆17Dec 11, 2023Updated 2 years ago
- BATCH: Adaptive Batching for Efficient Machine Learning Serving on Serverless Platforms☆11Aug 7, 2021Updated 4 years ago
- [ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts☆39Feb 29, 2024Updated 2 years ago
- ☆42Dec 9, 2025Updated 6 months ago
- COSCon Workshop on ECharts☆18Oct 18, 2018Updated 7 years ago
- Building a "Graduate Student Simulator" mini-game from scratch with AI☆44Apr 29, 2026Updated last month
- Research prototype of PRISM — a cost-efficient multi-LLM serving system with flexible time- and space-based GPU sharing.☆68Mar 17, 2026Updated 2 months ago
- A library for working with Dorico's Remote Control API☆15Feb 12, 2026Updated 3 months ago
- FreeRTOS with Earliest Deadline First (EDF) task scheduling.☆12Jul 13, 2017Updated 8 years ago
- ☆15Aug 15, 2024Updated last year
- ☆44Oct 11, 2025Updated 7 months ago
- ☆19Jun 1, 2026Updated last week
- Training camp track projects☆27Jan 28, 2026Updated 4 months ago
- Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores (EuroSys'25)☆15Jul 17, 2025Updated 10 months ago
- Bypasses VMM restrictions; for study and research only, not for commercial use!☆24Jun 12, 2025Updated 11 months ago