vllm-project / vllm-spyre
Community maintained hardware plugin for vLLM on Spyre
☆37 Updated this week
Alternatives and similar repositories for vllm-spyre
Users who are interested in vllm-spyre are comparing it to the libraries listed below.
- A hierarchical collective communications library with portable optimizations ☆36 Updated 11 months ago
- llm-d benchmark scripts and tooling ☆33 Updated this week
- A tool to detect infrastructure issues on cloud native AI systems ☆52 Updated 2 months ago
- CloudAI Benchmark Framework ☆74 Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆85 Updated last week
- MAD (Model Automation and Dashboarding) ☆30 Updated last week
- Cloud Native Benchmarking of Foundation Models ☆44 Updated 4 months ago
- TACOS: [T]opology-[A]ware [Co]llective Algorithm [S]ynthesizer for Distributed Machine Learning ☆29 Updated 5 months ago
- Reference implementations of MLPerf™ HPC training benchmarks ☆49 Updated 9 months ago
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆129 Updated this week
- Anatomy of High-Performance GEMM with Online Fault Tolerance on GPUs ☆12 Updated 7 months ago
- Microsoft Collective Communication Library ☆66 Updated last year
- Offline optimization of your disaggregated Dynamo graph ☆110 Updated this week
- DGXC Benchmarking provides recipes in ready-to-use templates for evaluating performance of specific AI use cases across hardware and soft… ☆47 Updated 3 weeks ago
- ☆27 Updated 3 months ago
- Systematic and comprehensive benchmarks for LLM systems. ☆41 Updated 2 months ago
- COCCL: Compression and precision co-aware collective communication library ☆28 Updated 8 months ago
- A Micro-benchmarking Tool for HPC Networks ☆33 Updated 2 months ago
- A recommendation model kernel optimizing system ☆12 Updated 5 months ago
- ☆24 Updated last month
- SYCL* Templates for Linear Algebra (SYCL*TLA) - SYCL based CUTLASS implementation for Intel GPUs ☆55 Updated this week
- This is the public repo for the MLPerf DeepCAM climate data segmentation proposal. ☆16 Updated 2 months ago
- Ongoing research training transformer models at scale ☆33 Updated this week
- NVIDIA NCCL Tests for Distributed Training ☆124 Updated 2 weeks ago
- Tutorials for running models on First-gen Gaudi and Gaudi2 for Training and Inference. The source files for the tutorials on https://dev… ☆62 Updated 2 months ago
- LLM Inference analyzer for different hardware platforms ☆96 Updated 4 months ago
- QuickReduce is a performant all-reduce library designed for AMD ROCm that supports inline compression. ☆35 Updated 3 months ago
- [DEPRECATED] Moved to ROCm/rocm-systems repo ☆29 Updated last week
- A validation and profiling tool for AI infrastructure ☆350 Updated last week
- ATLAHS: An Application-centric Network Simulator Toolchain for AI, HPC, and Distributed Storage ☆53 Updated 2 weeks ago