aliyun / SimAI
☆490 Updated this week
Alternatives and similar repositories for SimAI:
Users who are interested in SimAI are comparing it to the repositories listed below
- ☆196 Updated 4 months ago
- ☆41 Updated 6 months ago
- ☆57 Updated this week
- ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale ☆355 Updated this week
- Curated collection of papers in machine learning systems ☆325 Updated last month
- FlagPerf is an open-source software platform for benchmarking AI chips. ☆331 Updated this week
- Repository for MLCommons Chakra schema and tools ☆96 Updated last month
- GLake: optimizing GPU memory management and IO transmission. ☆457 Updated last month
- TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches ☆73 Updated last year
- A large-scale simulation framework for LLM inference ☆371 Updated 5 months ago
- ☆287 Updated last year
- paper and its code for AI System ☆299 Updated 3 weeks ago
- Disaggregated serving system for Large Language Models (LLMs). ☆580 Updated last month
- ☆16 Updated 2 months ago
- NCCL Profiling Kit ☆133 Updated 10 months ago
- This repository is established to store personal notes and annotated papers during daily research. ☆120 Updated 2 weeks ago
- Here are my personal paper reading notes (including cloud computing, resource management, systems, machine learning, deep learning, and o… ☆96 Updated this week
- A ChatGPT (GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems ☆165 Updated 6 months ago
- NS3 simulator for RDMA over Converged Ethernet v2 (RoCEv2), including the implementation of DCQCN, TIMELY, PFC, ECN and shared buffer swi… ☆297 Updated 6 years ago
- An interference-aware scheduler for fine-grained GPU sharing ☆133 Updated 3 months ago
- Microsoft Collective Communication Library ☆345 Updated last year
- DeepSeek-V3/R1 inference performance simulator ☆117 Updated last month
- Injecting Adrenaline into LLM Serving: Boosting Resource Utilization and Throughput via Attention Disaggregation ☆19 Updated last month
- Artifacts for our NSDI'23 paper TGS ☆75 Updated 11 months ago
- ☆178 Updated 2 years ago
- A highly optimized LLM inference acceleration engine for Llama and its variants. ☆884 Updated this week
- LLM serving cluster simulator ☆99 Updated last year
- ☆136 Updated last year
- Dynamic Memory Management for Serving LLMs without PagedAttention ☆366 Updated 3 weeks ago
- example code for using DC QP for providing RDMA READ and WRITE operations to remote GPU memory ☆129 Updated 9 months ago