EMDC-OS / power-aware-triton
Know Your Enemy To Save Cloud Energy: Energy-Performance Characterization of Machine Learning Serving (HPCA '23)
☆13 Updated last week
Alternatives and similar repositories for power-aware-triton
Users who are interested in power-aware-triton are comparing it to the libraries listed below.
- Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects. ☆38 Updated last week
- ☆9 Updated last week
- [ACM EuroSys '23] Fast and Efficient Model Serving Using Multi-GPUs with Direct-Host-Access ☆56 Updated last year
- ☆26 Updated 4 months ago
- LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale ☆119 Updated last week
- MISO: Exploiting Multi-Instance GPU Capability on Multi-Tenant GPU Clusters ☆19 Updated 2 years ago
- ☆49 Updated 6 months ago
- ☆23 Updated 3 years ago
- ☆70 Updated last month
- ☆12 Updated 2 months ago
- ☆25 Updated last year
- ☆25 Updated 2 years ago
- TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches ☆73 Updated last year
- Network Contention-Aware Cluster Scheduling with Reinforcement Learning (IEEE ICPADS 2023) ☆16 Updated 8 months ago
- ☆42 Updated this week
- Repository for MLCommons Chakra schema and tools ☆109 Updated last week
- Repository for MLCommons Chakra schema and tools ☆39 Updated last year
- ☆10 Updated last week
- [USENIX ATC '21] Exploring the Design Space of Page Management for Multi-Tiered Memory Systems ☆47 Updated 3 years ago
- LLM serving cluster simulator ☆106 Updated last year
- Dotfile management with bare git ☆19 Updated last month
- ☆100 Updated last year
- LaLaRAND: Flexible Layer-by-Layer CPU/GPU Scheduling for Real-Time DNN Tasks ☆15 Updated 3 years ago
- ☆37 Updated this week
- A Cycle-level simulator for M2NDP ☆28 Updated last month
- Justitia provides RDMA isolation between applications with diverse requirements. ☆40 Updated 3 years ago
- ☆23 Updated 2 years ago
- ☆36 Updated last year
- ☆16 Updated 4 months ago
- REEF is a GPU-accelerated DNN inference serving system that enables instant kernel preemption and biased concurrent execution in GPU sche… ☆94 Updated 2 years ago