aliyun / SimAI
☆456 Updated last week
Alternatives and similar repositories for SimAI:
Users interested in SimAI are comparing it to the repositories listed below:
- ☆192 Updated 2 months ago
- ☆40 Updated 4 months ago
- ☆54 Updated last month
- ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale ☆331 Updated last month
- FlagPerf is an open-source software platform for benchmarking AI chips. ☆325 Updated last month
- Curated collection of papers in machine learning systems ☆264 Updated 3 weeks ago
- ☆12 Updated last month
- ☆279 Updated last year
- An acceleration library that supports arbitrary bit-width combinatorial quantization operations ☆217 Updated 5 months ago
- Repository for MLCommons Chakra schema and tools ☆93 Updated last week
- TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches ☆72 Updated last year
- An interference-aware scheduler for fine-grained GPU sharing ☆129 Updated 2 months ago
- This repository is established to store personal notes and annotated papers during daily research. ☆115 Updated this week
- LLM serving cluster simulator ☆94 Updated 11 months ago
- Here are my personal paper reading notes (including cloud computing, resource management, systems, machine learning, deep learning, and o… ☆73 Updated this week
- GLake: optimizing GPU memory management and IO transmission. ☆449 Updated this week
- Disaggregated serving system for Large Language Models (LLMs). ☆507 Updated 7 months ago
- paper and its code for AI System ☆283 Updated 2 months ago
- ☆132 Updated last year
- Artifacts for our NSDI'23 paper TGS ☆76 Updated 9 months ago
- A large-scale simulation framework for LLM inference ☆351 Updated 4 months ago
- Separate from hardware and used to learn some NCCL mechanisms ☆17 Updated 11 months ago
- Synthesizer for optimal collective communication algorithms ☆106 Updated 11 months ago
- NS3 simulator for RDMA over Converged Ethernet v2 (RoCEv2), including the implementation of DCQCN, TIMELY, PFC, ECN and shared buffer swi… ☆288 Updated 6 years ago
- A low-latency & high-throughput serving engine for LLMs ☆327 Updated last month
- example code for using DC QP for providing RDMA READ and WRITE operations to remote GPU memory ☆123 Updated 7 months ago
- ☆48 Updated 9 months ago
- A highly optimized LLM inference acceleration engine for Llama and its variants. ☆881 Updated last week
- Unified KV Cache Compression Methods for Auto-Regressive Models ☆956 Updated 2 months ago
- ☆26 Updated 10 months ago