vllm-project / vllm-spyre
Community maintained hardware plugin for vLLM on Spyre
☆39 · Updated this week
Alternatives and similar repositories for vllm-spyre
Users who are interested in vllm-spyre are comparing it to the libraries listed below.
- llm-d benchmark scripts and tooling ☆41 · Updated last week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆85 · Updated this week
- A tool to detect infrastructure issues on cloud native AI systems ☆52 · Updated 3 months ago
- MAD (Model Automation and Dashboarding) ☆30 · Updated this week
- ☆13 · Updated 3 months ago
- ☆24 · Updated 3 months ago
- Systematic and comprehensive benchmarks for LLM systems. ☆47 · Updated last month
- CloudAI Benchmark Framework ☆81 · Updated this week
- A hierarchical collective communications library with portable optimizations ☆37 · Updated last year
- ☆51 · Updated 5 months ago
- DGXC Benchmarking provides recipes in ready-to-use templates for evaluating performance of specific AI use cases across hardware and soft… ☆54 · Updated this week
- Anatomy of High-Performance GEMM with Online Fault Tolerance on GPUs ☆13 · Updated 9 months ago
- This is the public repo for the MLPerf DeepCAM climate data segmentation proposal. ☆16 · Updated 3 months ago
- Cloud Native Benchmarking of Foundation Models ☆44 · Updated 5 months ago
- Reference implementations of MLPerf™ HPC training benchmarks ☆49 · Updated 10 months ago
- ☆56 · Updated this week
- NVIDIA NCCL Tests for Distributed Training ☆132 · Updated this week
- Ongoing research training transformer models at scale ☆35 · Updated last week
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note… ☆63 · Updated 6 months ago
- Acceleration codes for the Ozaki-scheme on integer matrix multiplication units. ☆15 · Updated last month
- Offline optimization of your disaggregated Dynamo graph ☆146 · Updated this week
- A TUI-based utility for real-time monitoring of InfiniBand traffic and performance metrics on the local node ☆61 · Updated 3 weeks ago
- Estimate resources needed to train LLMs ☆14 · Updated last month
- A recommendation model kernel optimizing system ☆12 · Updated 7 months ago
- Scale-out system monitoring ☆20 · Updated last month
- QuickReduce is a performant all-reduce library designed for AMD ROCm that supports inline compression. ☆36 · Updated 4 months ago
- A Micro-benchmarking Tool for HPC Networks ☆33 · Updated 4 months ago
- This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications. ☆203 · Updated this week
- Provides the examples to write and build Habana custom kernels using the HabanaTools ☆24 · Updated 8 months ago
- COCCL: Compression and precision co-aware collective communication library ☆29 · Updated 9 months ago